Hi, I'm working with regular expressions for the first time and I need to extract some data from this report (a txt file with formatting info):
\n10: Vikelis M, Rapoport AM. Role of antiepileptic drugs as preventive agents for \nmigraine. CNS Drugs. 2010 Jan 1;24(1):21-33. doi:\n10.2165/11310970-000000000-00000. Review. PubMed PMID: 20030417.\n\n\n21: Johannessen Landmark C, Larsson PG, Rytter E, Johannessen SI. Antiepileptic\ndrugs in epilepsy and other disorders--a population-based study of prescriptions.\nEpilepsy Res. 2009 Nov;87(1):31-9. Epub 2009 Aug 13. PubMed PMID: 19679449.\n\n\n
As you can see, every record in the file begins with a number like "xx:" and always ends with "PubMed PMID: dddddddd.", but using a regex like this:
regex = re.compile(r"^\d+: .+ PMID: \d{8}.$")
regex.findall(inputfile)
gives me a list containing one big string, so I'm misunderstanding something. How can I extract the data from these records?
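
One approach that might work, offered as a minimal sketch rather than a definitive answer: compile the pattern with re.MULTILINE so that ^ and $ anchor at every line instead of only at the ends of the whole string, add re.DOTALL so that . can cross the line breaks inside a record, make the quantifier lazy (.+?) so each match stops at the first PMID it reaches, and escape the final dot. The names report, RECORD_RE and extract_records are hypothetical, and the sample string assumes the \n sequences shown above are real line breaks in the file.

```python
import re

# Sample copied from the report above; assumes each "\n" is a real line break.
report = (
    "\n10: Vikelis M, Rapoport AM. Role of antiepileptic drugs as preventive agents for \n"
    "migraine. CNS Drugs. 2010 Jan 1;24(1):21-33. doi:\n"
    "10.2165/11310970-000000000-00000. Review. PubMed PMID: 20030417.\n\n\n"
    "21: Johannessen Landmark C, Larsson PG, Rytter E, Johannessen SI. Antiepileptic\n"
    "drugs in epilepsy and other disorders--a population-based study of prescriptions.\n"
    "Epilepsy Res. 2009 Nov;87(1):31-9. Epub 2009 Aug 13. PubMed PMID: 19679449.\n\n\n"
)

# ^ and $ match at line boundaries (MULTILINE); . also matches "\n" (DOTALL).
# The lazy .+? stops at the first "PubMed PMID:" instead of the last one.
RECORD_RE = re.compile(
    r"^(\d+): (.+?) PubMed PMID: (\d{8})\.$",
    re.MULTILINE | re.DOTALL,
)

def extract_records(text):
    """Return a list of (record number, citation text, PMID) tuples."""
    records = []
    for number, body, pmid in RECORD_RE.findall(text):
        # Collapse the line breaks and extra spaces inside the citation.
        records.append((int(number), " ".join(body.split()), pmid))
    return records

for number, body, pmid in extract_records(report):
    print(number, pmid, body[:40])
```

Because . is allowed to cross line breaks here, a record that is missing its PMID line would be merged into the following record, so this sketch relies on every record ending the way the sample does.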