Bibliography extraction from pasted text
Extract Citations From Text
Upload a Word document, or paste a bibliography, text copied from a DOCX file, a Google Docs reference list, an RIS block, BibTeX entries, or messy manuscript text. LumaCite structures the citations, detects DOI and PMID signals, checks metadata evidence, and prepares clean exports.
Upload Word, TXT, RTF, RIS, BibTeX, or CSV
Drop a `.docx`, `.txt`, `.rtf`, `.ris`, `.bib`, `.enw`, `.md`, or `.csv` file here. Word files are converted to plain text in your browser, then parsed with the same citation extraction engine used for pasted text.
Extraction report
Structured citation results
Outcome quality
Paste references to see split quality, identifier coverage, duplicate detection, source checks, and export readiness.
Reference quality report
Possible problems
Metadata sources checked
Text parser checks
Extracted citations
Preview export files
Text citation extraction for cleaner reference workflows
Turn pasted bibliography text into structured, export-ready citations.
LumaCite is built for the common moment when references are trapped in a Word bibliography, Google Docs copy, manuscript draft, web page, email, RIS block, or BibTeX snippet. Paste the text, split the citations, detect scholarly identifiers, review missing metadata, and export clean records for Zotero, EndNote, Mendeley, Overleaf, Word, Google Docs, spreadsheets, or manuscript QA.
Why pasted text is often easier than PDF extraction
Text copied from Word or Google Docs usually preserves reading order better than text extracted from a PDF. That gives the parser cleaner entry boundaries, fewer column-order errors, and clearer punctuation for authors, years, titles, journal names, volume, issue, and pages.
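To illustrate why clean entry boundaries matter, here is a minimal sketch of splitting a pasted, numbered reference list into entries. It is not LumaCite's parser: the `ENTRY_START` pattern and the `split_references` function are hypothetical names, and the sketch assumes every entry starts with a marker such as `[1]`, `1.`, or `1)`.

```python
import re

# Hypothetical pattern: an entry starts with "[12]", "12." or "12)" plus whitespace.
# Markers are limited to 1-3 digits so a wrapped line starting with a year
# such as "2020. " is not mistaken for a new entry.
ENTRY_START = re.compile(r"^\s*(?:\[\d+\]|\d{1,3}[.)])\s+")


def split_references(text: str) -> list[str]:
    """Split a numbered reference list into one string per entry."""
    entries: list[str] = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines carry no content
        if ENTRY_START.match(line) or not entries:
            # New entry: drop the numbering marker and keep the rest.
            entries.append(ENTRY_START.sub("", line).strip())
        else:
            # Wrapped line: join it to the entry it continues.
            entries[-1] += " " + stripped
    return entries


pasted = """[1] Smith J. First title.
    J Test. 2020;1:1-2.
[2] Doe A. Second title. 2021.
"""
for entry in split_references(pasted):
    print(entry)
```

Because pasted text keeps each entry in reading order, a simple line-based rule like this goes a long way. Unnumbered lists need a different boundary signal, such as blank lines or hanging indents.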
How the review report helps
LumaCite reports identifier coverage, duplicate risk, missing years, weak title detection, source checks, and row-level confidence. You can edit any citation before exporting instead of importing broken records into a reference manager.
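As a rough illustration of the duplicate-risk check, the sketch below groups records that share a DOI or, when no DOI is present, a normalized title and year. This is an assumption about how such a check could work, not LumaCite's actual logic; the record fields (`title`, `year`, `doi`) and the function names are hypothetical.

```python
import re
import unicodedata


def normalize_title(title: str) -> str:
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    ascii_text = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", " ", ascii_text.lower()).strip()


def duplicate_key(record: dict) -> str:
    """Prefer the DOI as the key; fall back to normalized title plus year."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return "doi:" + doi
    return "title:" + normalize_title(record.get("title", "")) + "|" + str(record.get("year", ""))


def find_duplicates(records: list[dict]) -> list[list[int]]:
    """Return groups of row indexes that share the same key."""
    groups: dict[str, list[int]] = {}
    for index, record in enumerate(records):
        groups.setdefault(duplicate_key(record), []).append(index)
    return [indexes for indexes in groups.values() if len(indexes) > 1]


rows = [
    {"title": "Deep Learning.", "year": 2015},
    {"title": "deep  learning", "year": 2015},
    {"title": "Another paper", "year": 2019, "doi": "10.1000/ABC"},
]
print(find_duplicates(rows))
```

One limitation of this sketch: a row with a DOI and a copy of the same work without one get different keys, so they are not grouped. A fuller check would compare titles as well.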
Where the engine reuses PDF extraction lessons
The text engine borrows the same identifier, enrichment, source audit, duplicate, and export logic used for PDF references, but removes PDF-specific layout hazards so pasted bibliography cleanup can stay fast.
Fast answers
About extracting citations from text
Can I paste references from Word or Google Docs?
Yes. Copy the bibliography or reference section from a Word document, Google Docs, or a manuscript and paste it into the extractor.
Can this parse RIS and BibTeX?
Yes. The parser detects common RIS tags and BibTeX entries, then converts them into the same editable citation review table.
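For readers curious what RIS tag detection involves, here is a minimal sketch. It assumes the standard RIS line layout (a two-character tag, two spaces, a hyphen, then the value) and treats `ER` as the end of a record. The `parse_ris` function is a hypothetical helper, not LumaCite's parser.

```python
import re

# A RIS line looks like "AU  - Smith, J." (tag, two spaces, hyphen, value).
RIS_LINE = re.compile(r"^([A-Z][A-Z0-9])  - ?(.*)$")


def parse_ris(text: str) -> list[dict[str, list[str]]]:
    """Parse RIS text into records mapping each tag to its list of values."""
    records: list[dict[str, list[str]]] = []
    current: dict[str, list[str]] = {}
    for raw_line in text.splitlines():
        match = RIS_LINE.match(raw_line.strip())
        if not match:
            continue  # skip blank lines and anything that is not a tagged line
        tag, value = match.group(1), match.group(2).strip()
        if tag == "ER":
            # End of record: store it and start a fresh one.
            if current:
                records.append(current)
            current = {}
        else:
            # Tags such as AU can repeat, so every tag holds a list.
            current.setdefault(tag, []).append(value)
    if current:
        records.append(current)  # tolerate a missing final ER line
    return records


sample = """TY  - JOUR
AU  - Smith, J.
AU  - Doe, A.
TI  - An example title
PY  - 2020
ER  - 
"""
print(parse_ris(sample))
```

Keeping every tag as a list preserves repeated fields such as multiple authors, which can then be mapped into a review table.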
Can I extract DOI, PMID, ISBN, and arXiv IDs from pasted references?
Yes. Each citation row is scanned for DOI, PMID, PMCID, arXiv, ISBN, ISSN, and URL values before metadata enrichment.
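The identifier scan can be pictured with a short sketch. The patterns below are illustrative assumptions covering only DOI, PMID, PMCID, arXiv, and URL values; they are not LumaCite's actual rules, and `scan_identifiers` is a hypothetical function name.

```python
import re

# Illustrative patterns only. Real-world identifiers have more edge cases.
IDENTIFIER_PATTERNS = {
    "doi": re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+", re.IGNORECASE),
    "pmid": re.compile(r"\bPMID:?\s*(\d{1,8})\b", re.IGNORECASE),
    "pmcid": re.compile(r"\bPMC\d+\b", re.IGNORECASE),
    "arxiv": re.compile(r"\barXiv:\s*(\d{4}\.\d{4,5}(?:v\d+)?)", re.IGNORECASE),
    "url": re.compile(r"https?://[^\s<>\"]+", re.IGNORECASE),
}


def scan_identifiers(citation: str) -> dict[str, list[str]]:
    """Return every identifier found in one citation string, grouped by type."""
    found: dict[str, list[str]] = {}
    for name, pattern in IDENTIFIER_PATTERNS.items():
        values = []
        for match in pattern.finditer(citation):
            # Use the captured group when the pattern has one, else the whole match.
            value = match.group(match.lastindex or 0)
            # Sentence punctuation often trails an identifier in a reference.
            values.append(value.rstrip(".,;)"))
        if values:
            found[name] = values
    return found


row = "Smith J. Title. J Test. 2020;1(2):3-4. doi:10.1000/xyz123. PMID: 12345678."
print(scan_identifiers(row))
```

Stripping trailing punctuation matters because a DOI at the end of a sentence would otherwise carry the final period into any metadata lookup.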
Can I export extracted citations to citation managers?
Yes. Download BibTeX, RIS, CSL-JSON, CSV, Markdown, or an audit report for Zotero, Mendeley, EndNote, Overleaf, Word, Google Docs, and spreadsheet workflows.
What should I do with low-confidence rows?
Open the row, edit the title, authors, year, source, DOI, or URL, then run metadata enrichment again. The row's confidence score updates after edits.