Windows folder merge or sort help

2868 Registered User regular
edited September 2010 in Help / Advice Forum
EDIT: Collate? Is collate the word?

This is a rough question, one I'm convinced I could google fu if only I could articulate it better. But here goes:

Is there an app that will assist me in collating, merging, sorting, or managing folders? For instance, I have a folder full of TIFFs.

Example:

/folder1
1111111.tiff
1111112.tiff
1111113.tiff
1111114.tiff
1111115.tiff
1111116.tiff
1111117.tiff

(imagine there are 35000 of these files)

Let's say I have a file I need inserted between each tiff so that my scanning software will recognize these as multiple documents, rather than one large document.

Example:

Separator sheet.pdf

Now I need to somehow insert a copy of Separator sheet.pdf between each TIFF and export this to a new folder

So that the end result looks like this example:

/Folder2
1111111.tiff
Separator sheet.pdf (copy 1)
1111112.tiff
Separator sheet.pdf (copy 2)
1111113.tiff
Separator sheet.pdf (copy 3)
1111114.tiff
Separator sheet.pdf (copy 4)
1111115.tiff
Separator sheet.pdf (copy 5)
1111116.tiff
Separator sheet.pdf (copy 6)
1111117.tiff


Is there a way to do this? My scanning software can do it with PDFs but cannot with TIFFs, for some reason. We're upgrading that software to a newer release this week but this project can't wait until then. I'd hate to have to hire someone to write an app that can do this...
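For what it's worth, the interleaving described above can be sketched in a few lines of Python. This is only a rough sketch under assumptions that are not from the post: the function name `interleave_with_separator` is made up, and the zero-padded numeric prefix on each output name is just one possible way to make a plain sort by name keep every separator copy right after its TIFF. Whether the scanning software accepts the renamed files is untested.

```python
import shutil
from pathlib import Path


def interleave_with_separator(src_dir, dst_dir, separator, pattern="*.tiff"):
    """Copy every file matching `pattern` from src_dir into dst_dir and put a
    copy of `separator` between consecutive files.

    Output names get a zero-padded numeric prefix (an assumption of this
    sketch) so that sorting by name preserves the interleaved order.
    Returns the created file names in order.
    """
    src_dir, dst_dir, separator = Path(src_dir), Path(dst_dir), Path(separator)
    dst_dir.mkdir(parents=True, exist_ok=True)
    sources = sorted(src_dir.glob(pattern))
    width = len(str(2 * len(sources)))  # enough digits for every output file
    created = []
    for i, src in enumerate(sources):
        name = f"{2 * i + 1:0{width}d}_{src.name}"
        shutil.copy2(src, dst_dir / name)
        created.append(name)
        if i < len(sources) - 1:  # no separator after the last document
            sep_name = f"{2 * i + 2:0{width}d}_{separator.name}"
            shutil.copy2(separator, dst_dir / sep_name)
            created.append(sep_name)
    return created
```

The originals in the source folder are only read, never modified, so this can be tried on a small test folder first.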

Also if you are a scanning/windows/file management sort of person with some programming knowledge and are maybe looking for a job in the Austin Area send me a PM.

Sorry this is unclear, but it's a question I don't exactly know how to ask.

Warhams. Allatime warhams.

buy warhams

Posts

  • fadingathedges Registered User regular
    edited September 2010
    This may be really intensive on your HDD space depending on the size of these files, but would this work?

    1) select all files

    2) copy

    3) pasta.


    So instead of
    1.tif
    2.tif
    3.tif

    you have

    1.tif
    1(1).tif
    2.tif
    2(1).tif
    3.tif
    3(1).tif
    etc.


    If it's the sequential numeric file names that your scanner has an issue with, maybe this will help.

  • fadingathedges Registered User regular
    edited September 2010
    Another option, if you are using photoshop:

    1) Record an action

    2) Open a file in the folder

    2a) Do an image size or some other steps to shrink the file if it's large.

    3) Save as a PDF in the same folder.

    4) Close the file.

    5) Stop recording the action.

    6) Apply the action to the folder in question.

    This might take a long time if you have 35,000 files, especially if they are big.


    I would recommend doing anything you try on a small dummy folder first with just a couple files in it, of course.


    edit: Collate is to put things in order. I think what you want to do is interleave.

  • embrik Registered User regular
    edited September 2010
    Well, when I have a need to rename/copy/move in bulk inside Windows, I use the BRU (Bulk Rename Utility). Not sure if it will be able to do what you need, but it's free - check it out here - http://www.bulkrenameutility.co.uk/Main_Intro.php

    "Damn you and your Daily Doubles, you brigand!"

    I don't believe it - I'm on my THIRD PS3, and my FIRST XBOX360. What the heck?
  • 2868 Registered User regular
    edited September 2010
    I did a pisspoor job of asking this question. Basically, I have 4,000 unique tiff files (each file is a multipage document) and I need to insert a file called seperatorsheet.pdf between each file. I need a program that will add this sheet (or a copy of it, creating the copy if necessary) between each file.

    The folder will then be processed by our scanning/indexing software. Without separator sheets, the scanning/indexing software will see the 4k documents as a single document.

    We typically only work with PDFs, and we have an application to handle our pdfs in this manner, but we've got nothing for tiffs.

  • TelMarine Registered User regular
    edited September 2010
    If that bulk renamer allows changing file extensions, couldn't you just temporarily rename all the .tiffs to .pdf, sort, then use bulk renamer to change back to .tiff?

    3ds: 4983-4935-4575
  • Djeet Registered User regular
    edited September 2010
    Does the content in the seperatorsheet.pdf matter (is it also being scanned in) or is it just being used to make the scanning software introduce breaks between the tiffs when scanning?


    As a one-off hack you could copy all the tiffs into /Folder2, so /Folder2 looks just like /folder1.

    Then in a command prompt window, change the working directory to /Folder2, and enter the following command "ren *.tiff *.pdf" (without quotes). All the files will then have the same filenames, but with pdf file extensions. Then move (cut and paste) the tiffs in /folder1 to /Folder2, so /folder1 is empty and /Folder2 looks like this when sorted alphabetically:


    /Folder2
    11111.pdf
    11111.tiff
    11112.pdf
    11112.tiff
    11113.pdf
    11113.tiff
    11114.pdf
    11114.tiff
    etc.

  • Goofball Registered User regular
    edited September 2010
    Just copy the separator sheet pdf as 1111111-1.pdf, 1111112-1.pdf, etc. Windows default ordering in sort by name mode will put the PDF file after each tiff. For example:
    1111111.tiff
    1111111-1.pdf
    1111112.tiff
    1111112-1.pdf
    1111113.tiff
    1111113-1.pdf
    

    I was bored, so I built a batch file that copies the separatorsheet.pdf file to a correctly named new file for each tiff, like so:
    @ECHO OFF
    CLS
    
    :SETSRCFILE
    REM Set source file name for one to many copy
    SET /P FileToCopy=Name of file to copy (Default: separatorsheet.pdf): 
    IF /I "%FileToCopy%"=="" SET FileToCopy=separatorsheet.pdf
    
    IF NOT EXIST "%FileToCopy%" (
         CLS
         ECHO.
         ECHO. Can't find %FileToCopy%. Please reenter.
         ECHO.
         PAUSE
         SET FileToCopy=
         CLS
         GOTO :SETSRCFILE
    )
    
    REM Set output file name mods for one to many copy
    SET /P OutFileNameExt=Text and extension appended to copied filename (Default: -1.pdf): 
    IF /I "%OutFileNameExt%"=="" SET OutFileNameExt=-1.pdf
    
    REM Set file type extension to create names from:
    SET /P Extension=File type extension of name source files (Default: tif): 
    IF /I "%Extension%"=="" SET Extension=tif
    
    ECHO.
    ECHO Script will create the following named copies of %FileToCopy%:
    ECHO.
    REM FOR loop to read each newline DIR /B *.%Extension% and output the results of the proposed copy operation
    REM FileName on each line in destfiles minus extensions + OutFileNameExt
    FOR /F "DELIMS=" %%A IN ('DIR /B *.%Extension%') DO ECHO %%~nA%OutFileNameExt%
    
    ECHO.
    ECHO Type YES and hit enter to continue if the above looks correct
    ECHO Any other key and enter to exit
    ECHO.
    SET /P FOO=Continue and actually copy? 
    IF /I "%FOO%"=="" GOTO :CLEANUP
    IF /I "%FOO%"=="YES" GOTO :DOCOPY
    GOTO :CLEANUP
    
    :DOCOPY
    ECHO.
    REM FOR loop to read each newline in DIR /B *.%Extension% and copy FileToCopy to
    REM FileName on each line in destfiles minus extensions + OutFileNameExt
    FOR /F "DELIMS=" %%A IN ('DIR /B *.%Extension%') DO COPY "%FileToCopy%" "%%~nA%OutFileNameExt%"
    ECHO.
    
    :CLEANUP
    SET FileToCopy=
    SET OutFileNameExt=
    SET Extension=
    SET FOO=
    
    PAUSE
    
    

    Copy everything in the CODE block and save it as a .cmd file (e.g. copyonetomany.cmd) in the directory with the TIFF files, then double-click it to start.

    Edit: little code cleanup in the batch script. Would have blown up if file names had spaces in them... This expects to be in the same directory with the files that need to be emulated. The separator page being copied should be in the same location as well.
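For anyone who would rather not use batch, the same idea (list the matching files, show the planned names, copy only after confirmation) could be sketched in Python roughly as follows. The function names `plan_copies` and `copy_one_to_many` and the `dry_run` flag are hypothetical, not part of the script above, and this sketch matches the extension exactly while ignoring case, which may differ from what `DIR /B` matches.

```python
import shutil
from pathlib import Path


def plan_copies(folder, extension="tif", suffix="-1.pdf"):
    """Return the separator names that would be created: each matching
    file's name minus its final extension, plus `suffix`."""
    folder = Path(folder)
    wanted = "." + extension.lower()
    return sorted(
        p.stem + suffix
        for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == wanted
    )


def copy_one_to_many(folder, source="separatorsheet.pdf", extension="tif",
                     suffix="-1.pdf", dry_run=True):
    """Copy `source` once per planned name; with dry_run=True only report."""
    folder = Path(folder)
    src = folder / source
    if not src.is_file():
        raise FileNotFoundError(f"Can't find {src}")
    names = plan_copies(folder, extension, suffix)
    if not dry_run:
        for name in names:
            shutil.copy2(src, folder / name)
    return names
```

Calling `copy_one_to_many(folder)` first and reading the returned list plays the same role as the "Type YES" prompt: nothing is written until `dry_run=False` is passed.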

    Edit Edit: Also note that all of the above assumes that the "separator page" is the same file copied many times over to a unique name. If you are trying to take two directories and merge them together many to many you would need to use something like the bulk renamer linked earlier or something like Ant Renamer: http://www.antp.be/software/renamer

    Super Extra Important Warning Edit:

    WORK FROM A COPY AND MAKE SURE YOU HAVE A BACKUP. Don't run this against your only copy of the data on the off chance something goes all wobbly.

    Twitter: @TheGoofball
  • 2868 Registered User regular
    edited September 2010
    Thanks. I will do this, and yes, we work from backups.

    To answer a few questions: the content in the separator sheet is important. The scanning/indexing software has an OCR that recognizes this sheet and splits when it encounters it.

    The filenames are more varied than 11111a, or whatever example I provided, so simply renaming the separator sheets to alternate will not work. (These files are contracts, thousands of contracts, with different dates and numbers to ID them.)

    I'll test tomorrow, but Goofball, will this script/code/whatchamoo work in the scenario I described?

    (4k files with varying names, basically I need a separator sheet every other file.)

    Also H/A, if you can help with this and live in or near Austin I shit you not PM me if you are out of work or want a job, maybe we need you.

  • Goofball Registered User regular
    edited September 2010
    It should work just fine even with non-sequential and wildly different file names, as long as the extensions are all the same or you run it multiple times, once for each file extension.

    All it does is perform a directory list of all the files matching the specified extension in the directory that it is run from and then copy whatever was set as the separator file as the original file name plus whatever extra text plus extension was specified.

    You should probably test this to make sure it works on a small subset copy of your files to start with before trying against the whole shebang.

    Example of what it does in testing on my PC:

    List of files originally in directory:
    1111111a.tiff
    copyonetomany.cmd
    Help and Advice SuperSekretContract.tiff
    LittleTimmy 24-7 Contract Revision B.tiff
    seperatorsheet.pdf
    Wang Industries Data 3355.tiff
    Wang Industries Data 3355.wang.penny.arcade.tiff
    

    Script Runtime Options:
    Name of file to copy: separatorsheet.pdf
    Text and extension for destination file: !SEPTEXT.EXTENSION
    File type extension of existing files: tiff
    

    Finished directory file structure (all !SEPTEXT.EXTENSION files are an exact copy of the original separatorsheet.pdf with new filename):
    1111111a!SEPTEXT.EXTENSION
    1111111a.tiff
    copyonetomany.cmd
    Help and Advice SuperSekretContract!SEPTEXT.EXTENSION
    Help and Advice SuperSekretContract.tiff
    LittleTimmy 24-7 Contract Revision B!SEPTEXT.EXTENSION
    LittleTimmy 24-7 Contract Revision B.tiff
    seperatorsheet.pdf
    Wang Industries Data 3355!SEPTEXT.EXTENSION
    Wang Industries Data 3355.tiff
    Wang Industries Data 3355.wang.penny.arcade!SEPTEXT.EXTENSION
    Wang Industries Data 3355.wang.penny.arcade.tiff
    

    At that point you could move "copyonetomany.cmd" and "separatorsheet.pdf" out of the directory and be left with hopefully what your scanner app needs.

    You may have to play around with the "Text and extension for destination file" setting to figure out what needs to be added to the copied separator files, if anything, other than the extension (.pdf), so that whatever software batch processes these files sees the original and separator files in the correct order. That setting can be basically anything you want and will be appended to the end of the original file name. For example, Wang Industries Data 3355.tiff becomes Wang Industries Data 3355 when the script processes it, and then becomes Wang Industries Data 3355+OutFileNameExt when it copies the original separator file.
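Since the right suffix depends on how the names end up sorted, a quick check like the following may help before copying anything. It is a sketch only: `sorted_pairs_ok` is a made-up helper, and by default it uses Python's plain character-by-character sort, which is an assumption and may not match how Explorer or the scanning software orders names (a different `key` can be passed in to approximate another ordering).

```python
def sorted_pairs_ok(tiff_names, suffix, tiff_ext=".tiff", key=None):
    """Check that, after sorting, every document is immediately followed by
    its own separator copy (document name minus tiff_ext, plus suffix)."""
    seps = [n[: -len(tiff_ext)] + suffix for n in tiff_names]
    ordered = sorted(list(tiff_names) + seps, key=key)
    for doc in tiff_names:
        expected_sep = doc[: -len(tiff_ext)] + suffix
        i = ordered.index(doc)
        if i + 1 >= len(ordered) or ordered[i + 1] != expected_sep:
            return False
    return True
```

Under that plain sort, `sorted_pairs_ok(["1111111.tiff", "1111112.tiff"], "-1.pdf")` returns `False` because "-" sorts before ".", while a hypothetical suffix such as ".tiff.sep.pdf" returns `True`.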

    Also, thinking about it after you mentioned file names: the previous batch script I hacked together had issues with files with multiple periods in the file name. The script was set up to assume that the only period it will encounter is the one before the file extension. It then cut off the first period it found in each file name and everything after it.
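To illustrate the period problem in a language-neutral way, here is a small Python sketch comparing the two behaviours. The helper names `cut_at_first_period` and `cut_last_extension` are made up for this example; the batch script itself handles this with `%%~nA`.

```python
from pathlib import Path


def cut_at_first_period(filename):
    """What the earlier script effectively did: drop the first period and
    everything after it."""
    return filename.split(".", 1)[0]


def cut_last_extension(filename):
    """Drop only the final extension, keeping any earlier periods."""
    return Path(filename).stem


name = "Wang Industries Data 3355.wang.penny.arcade.tiff"
print(cut_at_first_period(name))  # Wang Industries Data 3355
print(cut_last_extension(name))   # Wang Industries Data 3355.wang.penny.arcade
```

With the first approach, two different contracts that share the text before their first period would end up with the same separator name, which is why only the last extension should be removed.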

    I've edited my original post and updated the script to fix the period problem and a couple of other small issues.

  • 2868 Registered User regular
    edited September 2010
    Shit. You certainly went above and beyond... I was expecting to be told about a utility, not to have one created for me.

    I'll test this today, and thanks for the effort, whether it works or not.

  • 2868 Registered User regular
    edited September 2010
    Well, it didn't work. It looks like it's about to work, but then it doesn't generate the files.

    I'm still working the kajigger (first problem was that the file extension is .tif, not .tiff).

    Basically, I type yes then hit enter, the terminal window closes, and my folder is unchanged; no copies are generated.

    I'm going to mess around with it a bit more.

  • Goofball Registered User regular
    edited September 2010
    The first series of 3 prompts it gives you allow you to change the defaults if needed. So when it says "File type extension of name source files (Default: tiff): " you can type "tif" without quotes and hit enter to change it for that run from the default I set of "tiff". Same thing if you need to change the seperatorpage.pdf file to something else or the text and extension added to the copied file names.

    You should get a big list of files it will output before you have to type "YES" and hit enter. If you aren't seeing anything listed before it gets to that point then the file extension isn't matching anything.
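If it is unclear which extension the files actually have, counting them is a quick way to find out. The following Python sketch is just one way to do that; the function name `count_extensions` is hypothetical and nothing here is part of the batch script.

```python
from collections import Counter
from pathlib import Path


def count_extensions(folder):
    """Count files in `folder` by lowercased extension (without the dot)."""
    return Counter(
        p.suffix.lower().lstrip(".")
        for p in Path(folder).iterdir()
        if p.is_file()
    )
```

A result that shows thousands of `tif` entries and no `tiff` entries would explain an empty preview list when the script is left on a `tiff` default.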

    If you want to change the default file type to "tif" from "tiff" so it always looks for that do the following:

    Change this part:
    REM Set file type extension to create names from:
    SET /P Extension=File type extension of name source files (Default: tiff):
    IF /I "%Extension%"=="" SET Extension=tiff 
    

    To this:
    REM Set file type extension to create names from:
    SET /P Extension=File type extension of name source files (Default: tif):
    IF /I "%Extension%"=="" SET Extension=tif
    

    Not a big deal, like I said yesterday I was bored and batch scripting exercises the old brain cells a bit.

  • Orogogus San Diego Registered User regular
    edited September 2010
    Here's an ugly, hacky way to do it with Excel.

    You need to be able to open a command prompt window to the folder all the files are in. Instructions for doing this vary by the version of Windows you're using.

    1. Open a command window, type "dir/b > filelist.txt". This will do a barebones directory list and put it in filelist.txt, in the same folder.
    2. Drag filelist.txt into Excel. Just use the default options for text conversion (i.e., hit Finish at the first screen)
    3. Copy separator sheet.pdf or whatever into the folder.
    4. In Excel, move the contents of column A (the file list) into column C
    5. In cell A1, type copy "Separator sheet.pdf" and paste this down for however long column C goes on. (Keyboard shortcuts: start from the bottom by going to column C and pressing CTRL+down arrow, then move over to column A. Type copy "Separator sheet.pdf" in the last row, e.g. cell A31945, press CTRL+C to copy that cell, then press CTRL+SHIFT+up arrow to highlight up to the top, and CTRL+V to paste into all the cells.)
    6. Highlight column C, then do a search and replace on .tiff to change it into something like " - interstice.pdf". You might have to jump through more complicated hoops if your filenames are particularly awkward (e.g., if you have both 1013103.tiff and 10103103 - aardvark.tiff, then 1013103 - interstice.pdf won't come between them alphabetically and you'll have to use more spaces or something).
    7. Column B and column D : Copy a quotation mark " down both columns. You need the quotation marks because the copy command from the command prompt doesn't deal well with spaces in filenames unless you have the quotes.
    8. Save the file as FileRename.bat.
    9. Because Excel is going to mess up the quotation marks (it's probably going to double or quadruple them, as well as put in some tab spacing), open FileRename.bat in Notepad. Do the global search and replaces necessary to change

    "copy ""Separator sheet.pdf""" """" 1 - interstice.pdf """"

    into

    copy "Separator sheet.pdf" "1 - interstice.pdf"

    This command is going to copy the separator sheet file into the second filename, and hopefully there should be several thousand lines like this.

    10. From the command prompt, type FileRename to run the batch file. You can double click on it from the window, too, but the command prompt will let you see any errors before the window closes.

    EDIT: This probably goes without saying, but make sure to sort the window by filename after all this is done.
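The Excel and Notepad steps above boil down to turning a file list into one quoted `copy` command per TIFF, so the same batch file could also be generated by a short script. This Python version is a sketch under assumptions: the function names are made up, the " - interstice.pdf" suffix and FileRename.bat name are taken from the steps above, and file names containing a percent sign would need extra escaping in a batch file, which is not handled here.

```python
from pathlib import Path


def build_copy_commands(filenames, separator="Separator sheet.pdf",
                        old_ext=".tiff", new_suffix=" - interstice.pdf"):
    """Return one quoted `copy` command per file name ending in `old_ext`."""
    commands = []
    for name in filenames:
        if name.lower().endswith(old_ext):
            target = name[: -len(old_ext)] + new_suffix
            commands.append(f'copy "{separator}" "{target}"')
    return commands


def write_batch_file(folder, out_name="FileRename.bat", **kwargs):
    """Write the commands for every file in `folder` to a batch file there."""
    folder = Path(folder)
    names = sorted(p.name for p in folder.iterdir() if p.is_file())
    lines = build_copy_commands(names, **kwargs)
    (folder / out_name).write_text("\n".join(lines) + "\n")
    return lines
```

The generated file can be opened in Notepad and checked before it is run from the command prompt.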

  • 2868 Registered User regular
    edited September 2010
    Goofball wrote: »
    The first series of 3 prompts it gives you allow you to change the defaults if needed. So when it says "File type extension of name source files (Default: tiff): " you can type "tif" without quotes and hit enter to change it for that run from the default I set of "tiff". Same thing if you need to change the seperatorpage.pdf file to something else or the text and extension added to the copied file names.

    You should get a big list of files it will output before you have to type "YES" and hit enter. If you aren't seeing anything listed before it gets to that point then the file extension isn't matching anything.

    If you want to change the default file type to "tif" from "tiff" so it always looks for that do the following:

    Change this part:
    REM Set file type extension to create names from:
    SET /P Extension=File type extension of name source files (Default: tiff):
    IF /I "%Extension%"=="" SET Extension=tiff 
    

    To this:
    REM Set file type extension to create names from:
    SET /P Extension=File type extension of name source files (Default: tif):
    IF /I "%Extension%"=="" SET Extension=tif
    

    Not a big deal, like I said yesterday I was bored and batch scripting exercises the old brain cells a bit.

    I figured out how to switch it to .tif.

    When I run the program everything seems to be working, I get a message with what the new files will be called and it looks correct, I get the output, it says type yes, and I do, and the program just quits. I guess I can remove @echo off and see where the failure is happening...

  • Goofball Registered User regular
    edited September 2010
    I just checked it again. If it's getting to that point and not doing anything, then it probably can't find the "seperatorsheet.pdf" (note the misspelling there... oops) to copy. You can add the word PAUSE on a new line at the end of the script to make it wait for a keypress before exiting, so you can see what errors/messages are on the screen for the actual copy operation.

    EDIT: I updated the script to fix my spelling mistake on the default source filename, include some error checking on the source file to make sure it exists and made it pause at the end just in case.
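A one-letter spelling slip like this is easy to miss, so a check that also suggests near matches can save some head scratching. The sketch below shows the idea in Python using the standard `difflib` module; the function name `find_source_or_suggest` and the 0.8 similarity cutoff are assumptions for illustration, not something the batch script does.

```python
import difflib
from pathlib import Path


def find_source_or_suggest(folder, name):
    """Return the path to `name` inside `folder`, or raise FileNotFoundError
    naming similarly spelled files that do exist there."""
    folder = Path(folder)
    candidate = folder / name
    if candidate.is_file():
        return candidate
    existing = [p.name for p in folder.iterdir() if p.is_file()]
    close = difflib.get_close_matches(name, existing, n=3, cutoff=0.8)
    hint = f" Did you mean: {', '.join(close)}?" if close else ""
    raise FileNotFoundError(f"Can't find {name} in {folder}.{hint}")
```

Asking for "seperatorsheet.pdf" in a folder that only contains "separatorsheet.pdf" would then fail loudly and point at the correctly spelled file instead of quitting silently.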

  • 2868 Registered User regular
    edited September 2010
    Aha! Spelling error. I fixed it as well as made instructions more clear. You win an internet good sir.

  • Goofball Registered User regular
    edited September 2010
    2868 wrote: »
    Aha! Spelling error. I fixed it as well as made instructions more clear. You win an internet good sir.

    I'm assuming the limed part means it worked for you. Now just wait until you get my bill :P
