The new forums will be named Coin Return (based on the most recent vote)! You can check on the status and timeline of the transition to the new forums here.
Please vote in the Forum Structure Poll. Polling will close at 2PM EST on January 21, 2025.

Excel Macro or Python or some other programming help

mtsmts Dr. Robot KingRegistered User regular
edited June 2018 in Help / Advice Forum
Not sure if this is possible.

I did a functional test that basically takes a measure every 2ms for 7 minutes and saves this data into essentially a csv file. (it is a different extension but you can open it in excel just like one) I have data files that have something like 211,000 rows in them. I have about 100 of these files for different subjects.

What I would like to do is have a macro that will open the csv and then copy 3 columns in some rows and then paste that into a new workbook, preferably with some way to identify the file it came from and having organization

like say i have 10 files in folder A
It would output something like

Origin File 1 name Row # 1 column A, column B, column C
Origin File 1 name Row # 2 column A, column B, column C
Origin File 2 name Row # 1 column A, column B, column C
Origin File 2 name Row # 2 column A, column B, column C
.
.
.
Origin File 10 name Row # 1 column A, column B, column C
Origin File 10 name Row # 2 column A, column B, column C


I imagine you could do this with a macro but I have no clue how to even go about doing that.

camo_sig.png
mts on

Posts

  • L Ron HowardL Ron Howard The duck MinnesotaRegistered User regular
    edited June 2018
    Since those are CSVs, what I would do is write a program in C# to do just what you need. Very easy to handle the inputs and output, and you can view the final file in Excel if you so chose.


    Edit:
    I mean, it can be in Java or whatever other language you're familiar with. I'm thinking that if you're writing cool Excel macros, I'm sure you're knowledgeable enough to use a good programming language and slap something together in a little bit of time!

    L Ron Howard on
  • mtsmts Dr. Robot King Registered User regular
    edited June 2018

    Edit:
    I mean, it can be in Java or whatever other language you're familiar with. I'm thinking that if you're writing cool Excel macros, I'm sure you're knowledgeable enough to use a good programming language and slap something together in a little bit of time!

    HA, I am not knowledgeable in anything. Not a programmer, I just know to ask someone else to do it

    mts on
    camo_sig.png
  • DaenrisDaenris Registered User regular
    This would definitely be pretty simple in something like R or Python (or lots of other languages), but it's not something I'd try doing in Excel. That said, the example provided here should at least get you moving in the general direction to do it in Excel https://www.mrexcel.com/forum/excel-questions/579884-macro-code-open-files-copy-data-into-active-workbook.html

  • Edith_Bagot-DixEdith_Bagot-Dix Registered User regular
    I'm a quant finance guy so I am likely biased toward doing this in Python, but I would look into doing something like this:
    1. Install Python, numpy, pandas, and set up Jupyter notebooks if you haven't already
    2. Iterate over the files and load them each into a dataframe (essentially an in-memory representation of the table). There is a single function in pandas (read_csv) that does this. Store these dataframes in a dictionary where the key is the name of the source file.
    3. Perform whatever calculations you need to.
    4. Write the dataframes out to new files (using the write_csv function) and name the new files based on the keys of the dictionary.

    That's probably a bit vague but hopefully enough to get you started.



    Also on Steam and PSN: twobadcats
  • SerpentSerpent Sometimes Vancouver, BC, sometimes Brisbane, QLDRegistered User regular
    edited June 2018
    Faster than R, python, etc, would just be to use excel's built in Power Query gui functionality for this. Probably takes you 5 minutes.

    When I discovered power query back in 2015 or so I cried a little as I watched the last 5 years of my life be wasted doing this type of task in Java, C#, R, Python...

    just so much faster. and easier. and more repeatable. etc. etc.

    Serpent on
  • evilmrhenryevilmrhenry Registered User regular
    I'm a quant finance guy so I am likely biased toward doing this in Python, but I would look into doing something like this:
    1. Install Python, numpy, pandas, and set up Jupyter notebooks if you haven't already
    2. Iterate over the files and load them each into a dataframe (essentially an in-memory representation of the table). There is a single function in pandas (read_csv) that does this. Store these dataframes in a dictionary where the key is the name of the source file.
    3. Perform whatever calculations you need to.
    4. Write the dataframes out to new files (using the write_csv function) and name the new files based on the keys of the dictionary.

    That's probably a bit vague but hopefully enough to get you started.

    I wouldn't really recommend pandas to a beginner. The csv module should be perfectly fine for this kind of manipulation. Pandas is more for big data.

    Be warned that Excel has a roughly 1 million row limit. If you want to combine all the files, the resulting file might not be openable in Excel.

  • ArbitraryDescriptorArbitraryDescriptor Registered User regular
    I'm a quant finance guy so I am likely biased toward doing this in Python, but I would look into doing something like this:
    1. Install Python, numpy, pandas, and set up Jupyter notebooks if you haven't already
    2. Iterate over the files and load them each into a dataframe (essentially an in-memory representation of the table). There is a single function in pandas (read_csv) that does this. Store these dataframes in a dictionary where the key is the name of the source file.
    3. Perform whatever calculations you need to.
    4. Write the dataframes out to new files (using the write_csv function) and name the new files based on the keys of the dictionary.

    That's probably a bit vague but hopefully enough to get you started.

    I wouldn't really recommend pandas to a beginner. The csv module should be perfectly fine for this kind of manipulation. Pandas is more for big data.

    Be warned that Excel has a roughly 1 million row limit. If you want to combine all the files, the resulting file might not be openable in Excel.
    +1 for python.

    I'm not certain, but I think Excel may open files with more than 1m rows if you have the ram: but you will lose those records.

    We noted that some huge files were getting smaller after a small header tweak in excel, and believe this was what was happening, but moved on because, fuck it, python.

    But here is a "No, fuck you: Power Query/Pivot!" solution
    A common myth I hear very frequently is that you can’t work with more than 1 million records in Excel. Actually, the right myth should be that you can’t use more than 1,048,576 rows, since this is the number of rows on each sheet; but even this one is false.

    In this post I’ll debunk this myth by creating a PivotTable from 50 million records in Excel.

    To make things more interesting, I’ll import data from 20 different text files (.csv) with 2.5 million records each.

    https://www.masterdataanalysis.com/ms-excel/analyzing-50-million-records-excel/

  • mtsmts Dr. Robot King Registered User regular
    edited July 2018
    I wasnt planning on taking all the rows. just select ones from each file.

    I am working with output files for a stimulator/force transducer. For our fatigue protocol it is a 7 minute long course of stimulations every 4 secs. But it also takes a measure every 2 ms.

    I wanted to grab the measures at each minute ( 0, 1,2,... 7) to make it easier to work with.
    I ended up just brute forcing it using the analysis software since it wasn't that bad since it was able to do roughly 30 files at a time.

    mts on
    camo_sig.png
Sign In or Register to comment.