I am finishing my dissertation for my PhD. I am essentially taking the two journal papers I wrote and had published and putting them in between an intro and a discussion. Now I need to do some formatting etc. on the entire thing for consistency's sake. The problem is I don't have the original Word document for my first paper. I only have a PDF of it.
Is there an easy way to convert the paper into office format?
I am using a Mac, but I do have Windows on this machine.
I can access an HTML version of the paper, but if there's an easier way than copying it from there I would appreciate it.
No, not really. PDF is a fixed format, sort of like compiled code; there's no easy way to get the source document out of it. You can copy and paste the text from it, of course, but there really aren't any better methods.
I think the full version of Acrobat has some options to save as a Word file, but I'm pretty sure it's garbage. You can Export or Save as Text, depending on whether you have Acrobat or just Acrobat Reader, but these give the same output as if you just did CTRL+A, CTRL+C, CTRL+V. You will have to redo all your formatting, fix the line breaks manually, adjust all the hyphenation, etc.
Maybe Adobe InDesign or Pagemaker, or a similar desktop publishing program, would have more robust options? I'm not sure, never having used them.
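Some of the cleanup mentioned above (fixing line breaks and hyphenation after copy and paste) can be scripted instead of done by hand. Here is a rough Python sketch, not a polished tool. The function name `unwrap_pdf_text` is made up for illustration, and it assumes paragraphs in the pasted text are separated by blank lines, which is not true of every PDF. It also can't tell a real hyphenated compound that happens to break across lines from ordinary hyphenation, so proofread the result.

```python
import re


def unwrap_pdf_text(raw: str) -> str:
    """Rejoin hard-wrapped lines pasted from a PDF into flowing paragraphs.

    Assumption: paragraphs are separated by at least one blank line.
    """
    # Normalize Windows/old-Mac line endings to "\n".
    text = raw.replace("\r\n", "\n").replace("\r", "\n")

    cleaned = []
    for para in re.split(r"\n\s*\n", text):
        # Rejoin a word split by end-of-line hyphenation:
        # "for-\nmat" becomes "format". This also merges genuine
        # compounds that break at the hyphen, so check the output.
        para = re.sub(r"(\w)-\n(\w)", r"\1\2", para)
        # Every remaining line break inside the paragraph becomes a space.
        para = re.sub(r"\s*\n\s*", " ", para).strip()
        if para:
            cleaned.append(para)

    return "\n\n".join(cleaned)


if __name__ == "__main__":
    sample = "PDF is a fixed for-\nmat, sort of like\ncompiled code."
    print(unwrap_pdf_text(sample))
```

You would paste the PDF text into a plain text file, run it through this, and then paste the result into Word. You still have to redo the formatting (headings, italics, tables) by hand.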
If the text is in the PDF (as in, it's highlightable) it's not too bad, but still a pain. Your best bet, other than paying ridiculous amounts of money, is to just copy and paste it into Word. If you have Illustrator, you can get it out as well, and a little more consistently, but it's still a manual affair.
If it's a scanned PDF (the text isn't selectable), no. You can use some sort of OCR program on it, but it's going to suck unless you shell out major cash, and it's still going to be riddled with errors. You're better off retyping it.
The company I work for makes its business out of working with documents, and we still don't have a decent way to get shit out of PDFs. I fucking hate when clients send PDF files because it means about 100x more effort than if they sent us the original Word files.
You say you have the HTML version though? You can open HTML files in Word and then resave them as a .doc, I'm fairly sure. Either way you're going to have 1000000x better luck with the HTML file, even if you're just going to copy and paste into Word, than you will with the PDF. At least you won't lose all your formatting that way.
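If you'd rather pull the text out of the HTML yourself than open it in Word, something like the following works as a starting point. It's only a sketch using Python's standard `html.parser` module. The class and function names (`BlockTextExtractor`, `extract_blocks`) are invented for this example, and it assumes the paper's HTML uses ordinary `<h1>` to `<h6>`, `<p>`, and `<li>` tags for its content. It throws away inline formatting like italics, so opening the HTML directly in Word will keep more than this does.

```python
from html.parser import HTMLParser


class BlockTextExtractor(HTMLParser):
    """Collect (tag, text) pairs for headings, paragraphs and list items."""

    BLOCK_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6", "p", "li"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.blocks = []      # finished (tag, text) pairs, in document order
        self._tag = None      # block tag currently being collected, if any
        self._parts = []      # text fragments seen inside that block

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self._flush()     # an unclosed <p> ends when the next block starts
            self._tag = tag

    def handle_endtag(self, tag):
        if tag == self._tag:
            self._flush()

    def handle_data(self, data):
        # Text outside a block tag (menus, scripts, etc.) is ignored.
        if self._tag is not None:
            self._parts.append(data)

    def _flush(self):
        if self._tag is not None:
            # Collapse runs of whitespace, including line breaks, to one space.
            text = " ".join("".join(self._parts).split())
            if text:
                self.blocks.append((self._tag, text))
        self._tag = None
        self._parts = []


def extract_blocks(html: str):
    parser = BlockTextExtractor()
    parser.feed(html)
    parser.close()
    parser._flush()           # pick up a trailing block with no closing tag
    return parser.blocks


if __name__ == "__main__":
    page = "<h1>Intro</h1><p>Some <em>emphasized</em>\n text.</p>"
    for tag, text in extract_blocks(page):
        print(tag, text)
```

Knowing which block was a heading and which was a paragraph at least lets you reapply Word styles quickly.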
It's basically an online PDF>Word or Word>PDF converter. You upload your file in one of the formats, and it comes out in the other.
Worked pretty well for most (if not all) of my PDF documents.
has worked for me
If you can't highlight the text (it is a picture PDF), then you need access to either an OCR program or a scanner with OCR capability.
And this pdf file didn't have any images at all.
We use this at work. It rules. They have a demo available, so check it out!