Open
Labels
enhancement (New feature or request), good first issue (Good for newcomers), hacktoberfest 🍁 (https://hacktoberfest.digitalocean.com/), help wanted (Extra attention is needed)
Description
As part of the corpus creation process, the content of each PDF should be converted to text and aggregated into a single large dataset.
The dataset should be stored in the data/papers/processed folder, and the script that creates it should be saved as src/papers/data/make_dataset.py.
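A possible starting point for make_dataset.py is sketched below. The issue does not specify several details, so these are assumptions: the PDFs are read from a hypothetical data/papers/raw folder, text is extracted with the pypdf library, and the dataset is written as JSON Lines (one record per paper) to a hypothetical dataset.jsonl file. The function names make_dataset and extract_text_pypdf are illustrative only.

```python
import json
from pathlib import Path
from typing import Callable


def extract_text_pypdf(pdf_path: Path) -> str:
    """Extract the text of one PDF using pypdf (an assumed library choice)."""
    from pypdf import PdfReader  # imported lazily; only needed for real PDFs

    reader = PdfReader(str(pdf_path))
    # extract_text() may yield nothing for image-only pages, so fall back to ""
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def make_dataset(
    raw_dir: Path,
    out_file: Path,
    extract: Callable[[Path], str] = extract_text_pypdf,
) -> int:
    """Write one JSON record per PDF found in raw_dir; return the record count."""
    out_file.parent.mkdir(parents=True, exist_ok=True)
    count = 0
    with out_file.open("w", encoding="utf-8") as fh:
        # Sort for a deterministic record order across runs
        for pdf_path in sorted(raw_dir.glob("*.pdf")):
            record = {"filename": pdf_path.name, "text": extract(pdf_path)}
            fh.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count
```

Under these assumptions, the script could be run with make_dataset(Path("data/papers/raw"), Path("data/papers/processed/dataset.jsonl")). Passing the extractor as a parameter keeps the aggregation logic testable without real PDFs and makes it easy to swap in a different PDF library.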