-
Notifications
You must be signed in to change notification settings - Fork 741
Open
Labels
bugSomething isn't workingSomething isn't working
Description
Prerequisites
- Check that your issue is not already filed:
https://github.com/leanprover/lean4/issues - Reduce the issue to a minimal, self-contained, reproducible test case.
Avoid dependencies to Mathlib or Batteries. - Test your test case against the latest nightly release, for example on
https://live.lean-lang.org/#project=lean-nightly
(You can also use the settings there to switch to “Lean nightly”)
Description
The functionLean.Data.Xml.Parser.PubidChar does not consume the space character despite it being clearly stated as an allowed character in the referenced https://www.w3.org/TR/xml/#NT-PubidChar :
[13] | PubidChar | ::= | #x20 \| #xD \| #xA \| [a-zA-Z0-9] \| [-'()+,./:=?;!*#@$_%]
Context
This makes it impossible to parse MusicXml (https://github.com/w3c/musicxml), since a typical !DOCTYPE looks like this:
<!DOCTYPE score-partwise PUBLIC "-//Recordare//DTD MusicXML 4.0 Partwise//EN" "http://www.musicxml.org/dtds/partwise.dtd">
Steps to Reproduce
import Lean.Data.Xml
#eval Lean.Xml.parse "<!DOCTYPE a PUBLIC \"b c\" \"d\"><a></a>"
Expected behavior: ok: <a></a>
Actual behavior: error: offset 21: expected: '"'
Versions
4.27.0-rc1
MacOS 15.7.3
Additional Information
Impact
Metadata
Metadata
Assignees
Labels
bugSomething isn't workingSomething isn't working