The Unicode normalization step of the Python interpreter can be abused

This is basically the suggestion in [this Reddit comment](https://www.reddit.com/r/Python/comments/rra22x/comment/hqfcqbr/?utm_source=share&utm_medium=web2x&context=3).


From [this article](https://www.asmeurer.com/python-unicode-variable-names/):
> Python always applies [NFKC](https://en.wikipedia.org/wiki/Unicode_equivalence#Normalization)
normalization to characters. Therefore, two distinct characters may actually
produce the same variable name. For example:
> ```python
> >>> ª = 1 # FEMININE ORDINAL INDICATOR
> >>> a # LATIN SMALL LETTER A (i.e., ASCII lowercase 'a')
> 1
> ```
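
To make the quoted behaviour easy to check, here is a minimal sketch using only the standard library. It assumes CPython 3; the `namespace` dict is just an illustrative name, not something from the article.

```python
import unicodedata

# U+00AA FEMININE ORDINAL INDICATOR becomes ASCII 'a' under NFKC.
assert unicodedata.normalize("NFKC", "\u00aa") == "a"

# Identifiers are normalized the same way when source code is compiled,
# so a value bound through the look-alike name is stored under plain 'a'.
namespace = {}
exec("\u00aa = 1", namespace)
assert namespace["a"] == 1
```

Running the assignment through `exec` with an explicit dict keeps the example independent of the encoding of the file it is pasted into.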

I've generated a mapping of these characters from [this page](https://appcheck-ng.com/wp-content/uploads/unicode_normalization.html).\
The mapping can be found [here](https://gist.githubusercontent.com/wasi-master/c68a8065fd2234b196cfe2a8c1723afc/raw/96736fb17b648e4acacd7dffe8be46cf6337b639/nkfc.json). Beware that some of the characters may not be valid in Python identifiers, because I haven't tested every one of them.
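
Since the mapping is untested, one possible way to check or regenerate it is to ask Python itself. The sketch below is an assumption about how that could be done, not how the gist was built: `build_lookalike_map` is a hypothetical helper that scans every code point, keeps those whose NFKC form is one of the target characters, and drops any that `str.isidentifier()` rejects as a one-character name.

```python
import string
import sys
import unicodedata
from collections import defaultdict


def build_lookalike_map(targets=string.ascii_letters):
    """Map each target character to non-ASCII code points that
    NFKC-normalize to it and can start a Python identifier.

    Hypothetical helper for illustration only.
    """
    wanted = set(targets)
    mapping = defaultdict(list)
    for codepoint in range(0x80, sys.maxunicode + 1):
        if 0xD800 <= codepoint <= 0xDFFF:
            continue  # skip surrogates, which cannot appear in source code
        ch = chr(codepoint)
        normalized = unicodedata.normalize("NFKC", ch)
        # isidentifier() on a single character checks that it can start a name.
        if normalized in wanted and ch.isidentifier():
            mapping[normalized].append(ch)
    return dict(mapping)
```

For example, `build_lookalike_map("a")["a"]` should include `ª` (U+00AA) and the fullwidth `ａ` (U+FF41). The exact contents depend on the Unicode version bundled with the interpreter, and characters that are only valid in a non-initial position (such as digit look-alikes) are deliberately left out by this check.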

I suggest adding an optional flag to enable this behaviour.

I would have implemented it myself and opened a PR, but I am too busy at the moment.

