So I'm wondering if someone knows how to form an Excel macro that will insert a specific character combination (#n) after a certain number of letters (43) in a cell. The catch is, it can't dissect a word that's already there, so if character 43 isn't a space it has to backtrack to find the nearest space.
Basically, the #n is creating a carriage return in the text, dividing it into 3 separate lines (for a Japanese game programmer, I dunno if that's common). I can go through ~20k lines and do it myself, but it's gonna take balls long, and I was hoping there might be a faster way. Any help at all would be appreciated.
Unfortunately, I don't think the "if in the middle of a word, backtrack to the previous space" bit is possible with Excel macros.
Actually, the way the OP wrote it, it's just a matter of finding the "last space" in a set of 43 characters. The logic is start at the last character and worked backwards until you find the space, then insert the #n.
I assume you have a column of cells containing 43 characters each? If so, the logic would follow:
(Pseudo-code)
length = length of string (43)
for (int i = length; i > 0; i--)
if InStr at i returns > 0 when searching for a space, a space was found at position i in the string (break loop, keep value of i)
(otherwise keep looping)
The value of i gives us the position of the last space, so then we can break the string at i and insert our #n
If you're not a programmer and don't know how a for loop works let me know. I thought about writing a macro for you but I'm too out of touch with VBA :P
So I'm wondering if someone knows how to form an Excel macro that will insert a specific character combination (#n) after a certain number of letters (43) in a cell. The catch is, it can't dissect a word that's already there, so if character 43 isn't a space it has to backtrack to find the nearest space.
It sounds a bit to me like what you have is not really an Excel-file that needs to be handled but perhaps simply a list which needs to broken down into a list with items separated by line changes?
Is the file actually in Excel format or is it just really a file that for some reasons is opened by Excel. If it's the later you might have an easier time opening in something like Textpad (Advanced shareware text editor) and then solve you need by using it's advance search&replace function.
Will it always have at least one space in the first 43 characters?
If you're happy to accept cell formulas and can guarantee that there's at least one space in the first 43 characters, then I think it might be possible by doing some trickery using stuff from here and a few formulas in adjacent cells.
In fact, I've whipped a quick demo up with the formulas here.
Unfortunately, I don't think the "if in the middle of a word, backtrack to the previous space" bit is possible with Excel macros.
Actually, the way the OP wrote it, it's just a matter of finding the "last space" in a set of 43 characters. The logic is start at the last character and worked backwards until you find the space, then insert the #n.
I assume you have a column of cells containing 43 characters each? If so, the logic would follow:
(Pseudo-code)
length = length of string (43)
for (int i = length; i > 0; i--)
if InStr at i returns > 0 when searching for a space, a space was found at position i in the string (break loop, keep value of i)
(otherwise keep looping)
The value of i gives us the position of the last space, so then we can break the string at i and insert our #n
If you're not a programmer and don't know how a for loop works let me know. I thought about writing a macro for you but I'm too out of touch with VBA :P
Your technique is correct but implementation depends on how his data is arranged in the spreadsheet.
OP, you need to explain more about the structure of the data in your Excel file. Is it 20,000 rows of text in column A? Or is all of the text in cell A1? Or what's the situation there?
If you can tell me how the text is laid out I can whip this up for you super quick.
Actually, fuck it. Here's VBA code that should probably run (on a mac without office, too lazy to google all of my syntax, pretty sure it's good though just tell me if you get errors and on which line and I'll figure it out as we go).
This is what you'd need to do to insert as many #n's as necessary for a string of whatever length is encountered.
The only substitutions you need to make are on the line that starts out with apostrophe. Remove the apostrophe and set the variable there to be equal to whatever it needs to equal. If the text is in cell A1, the easiest way to reference that is "activesheet.range("A1").value", but if you have a whole column full of cells that have text in them of varying lengths, we need to wrap a loop around the entire snippet below.
If that is the case, let me know which column the data will be in and I'll do the rest.
Dim length as long, str1 as string, i as long, anchor as long, limit as long
'str1 = original text
length = len(str1)
i = 1
anchor = 1
limit = 43
do while (i + 43 < length)
do while (i < limit)
if mid(str1,i,1) = " " then
anchor = i
end if
i = i + 1
loop
if mid(str1,i,1) = " " then
str1 = left(str1,i) & "#n" & right(str1,len(str1)-i)
limit = i + 43
anchor = i
else
str1 = left(str1,anchor) & "#n" & right(str1,len(str1)-anchor)
limit = anchor + 43
i = anchor
end if
loop
So what we do here is loop until the current position in the text exceeds the length of the text.
Then we enter into a loop that we follow until we hit the predefined limit (43 in this case)
While in the inner loop, we look for space characters and each time we find one, we place an anchor on it.
After we've run up to the limit of 43 characters we check whether the character we're on is a space character or not. If it is, then we insert the #n where we are. If it's not, we insert the #n back where our anchor is (the last space we'd encountered).
Then we take where ever we put our #n, and count 43 characters out from there and define that as our next limit, and repeat the inner loop and subsequent #n insertion check.
We keep doing these things until we've gotten to the point where 43 characters further would place us beyond the end of the text, and at that point skip out of the loops.
Pheezer on
IT'S GOT ME REACHING IN MY POCKET IT'S GOT ME FORKING OVER CASH
CUZ THERE'S SOMETHING IN THE MIDDLE AND IT'S GIVING ME A RASH
Also note that if you work for a dev (and we both know you do) you probably have access to someone who could whip up this same code, using what I gave as a model to save time explaining yourself, using a far quicker option.
If you were to do this with a text file and write the code in a real programming language rather than an embedded macro language you would find that the code executes about a million times faster. I work with a lot of VBA code written to automate spreadsheet tasks and to assemble reports and things and stuff like the project you're looking at tends to run very slowly in VBA.
Of course, if it takes half an hour to run in VBA and would only take 3 minutes to run in C++, either are pretty shit hot options to spending 3 weeks doing it by hand.
Pheezer on
IT'S GOT ME REACHING IN MY POCKET IT'S GOT ME FORKING OVER CASH
CUZ THERE'S SOMETHING IN THE MIDDLE AND IT'S GIVING ME A RASH
This is a more "appropriate" method in my opinion. I just wrote it IN Excel to make sure it works, and it does.
This performs on the active cell (i.e. wherever your cursor is) but it can be automated to loop through an entire column if that's better.
Sub Insert_Stuff_At_43rd_Character()
Dim str_CellString As String
Dim rng_Cell As Range
Dim iChar As Integer
Const str_TextToInsert As String = "MOOGLYBOOGLY"
Set rng_Cell = ActiveCell
str_CellString = rng_Cell.Value
If Len(str_CellString) < 43 Then
rng_Cell.Value = str_CellString & " " & str_TextToInsert
Exit Sub
End If
If Mid(str_CellString, 43, 1) = " " Then
rng_Cell.Value = Left(str_CellString, 43) & str_TextToInsert & " " & Right(str_CellString, Len(str_CellString) - 43)
Exit Sub
End If
If Mid(str_CellString, 43, 1) <> " " Then
iChar = InStrRev(str_CellString, " ", 43)
rng_Cell.Value = Left(str_CellString, iChar) & str_TextToInsert & " " & Right(str_CellString, Len(str_CellString) - iChar)
Exit Sub
End If
End Sub
Personally, I prefer to nest my IF statements. And the "Exit Sub" lines are wholly unnecessary. I wrote it this way so it's easier for you to read and process. A more concise method would be:
Sub Insert_Stuff_At_43rd_Character_Concise()
Dim str_CellString As String
Dim rng_Cell As Range
Dim iChar As Integer
Const str_TextToInsert As String = "MOOGLYBOOGLY"
Set rng_Cell = ActiveCell
str_CellString = rng_Cell.Value
If Len(str_CellString) < 43 Then
rng_Cell.Value = str_CellString & " " & str_TextToInsert
Else
If Mid(str_CellString, 43, 1) = " " Then
rng_Cell.Value = Left(str_CellString, 43) & str_TextToInsert & " " & Right(str_CellString, Len(str_CellString) - 43)
Else
If Mid(str_CellString, 43, 1) <> " " Then
iChar = InStrRev(str_CellString, " ", 43)
rng_Cell.Value = Left(str_CellString, iChar) & str_TextToInsert & " " & Right(str_CellString, Len(str_CellString) - iChar)
End If
End If
End If
End Sub
But they both do the same thing.
And of course you should change "MOOGLYBOOGLY" to whatever text you want to insert. The code above also assumes you want a space before and after the inserted text.
The process is basically to look at the value of the active cell and do the following:
First, determine if the string is less than 43 characters long. If so, add a space, append the text to the cell, and exit the macro.
Second, determine if the 43rd character is a space. If so, append the text at the 43rd character in the cell, add a space, and exit the macro.
Third, verify that the 43rd character is not a space (not entirely necessary at this point but it's good practice to check anyway). If so, use the InStrRev function to find the most recent previous space (starting at the 43rd character and looking backward), append the text at this location, add a space, and exit the macro.
Sorry, I just noticed you want a carriage return. That's easy enough. Is #n something that you are using in another interpreter? The carriage return in Excel is vbCr or Chr(13). If the former, just change the constant TextToInsert to vbCr and change the & " " & to &. If the latter, just change it to #n and change the & " " & to &.
Sub Insert_Stuff_At_43rd_Character_Loop()
Dim str_CellString As String
Dim iChar, iRow, iLastRow As Integer
Const str_TextToInsert As String = "#n" 'this is the text that will be inserted at the 43rd character
Const iColumn As Integer = 1 'this is the column in which your text resides, change it as necessary
iLastRow = Range("A65536").End(xlUp).Row
For iRow = 1 To iLastRow
str_CellString = Cells(iRow, iColumn).Value
If Len(str_CellString) > 43 Then
If Mid(str_CellString, 43, 1) = " " Then
Cells(iRow, iColumn).Value = Left(str_CellString, 43) & str_TextToInsert & Right(str_CellString, Len(str_CellString) - 43)
Else
If Mid(str_CellString, 43, 1) <> " " Then
iChar = InStrRev(str_CellString, " ", 43)
Cells(iRow, iColumn).Value = Left(str_CellString, iChar) & str_TextToInsert & Right(str_CellString, Len(str_CellString) - iChar)
End If
End If
End If
Next iRow
End Sub
Sorry, I guess I should have provided more detail. I whipped this up as my eyes were ready to bleed out of their sockets, so I sorta glossed over it.
First of all, this is an actual Excel file. I do localization for a game company, and the Japanese developer sent us this Excel with all the game text in it. So basically, each cell is a box of dialogue in the game. There's about 8500 separate cells, and each cell can have up to 3 line breaks (#n) to appear properly in the game. The max character limit for each line is 42 characters, so only if the 43rd character is a space can there be a break; otherwise, it has to break before that basically.
I don't know what voodoo the Japanese devs do with the Excel file when we send it to them, but it's our job to insert the line breaks ourselves. I probably didn't explain this very well because I have absolutely no programming/formula skill or anything. I'm gonna dive into a few of these formulas and see what happens. Thanks for the help!
Edit: I just realized another hitch that probably throws everything out of whack. Each individual cell will contain up to 3 42-character length lines of text, so after the #n is inserted there can still be more text after that that needs another #n. Hrrrmm...
Edit: I just realized another hitch that probably throws everything out of whack. Each individual cell will contain up to 3 42-character length lines of text, so after the #n is inserted there can still be more text after that that needs another #n. Hrrrmm...
Well, we can work with that. Did the formula in the cells work for you? Or did the macros work better?
I'll update my formulas to do what you want if you want, gimme a few minutes.
Edit: Here you go. Reformatted the spreadsheet so that it handles up to 3 42-character length lines.
If you're happy to use formulas and cells, just copy the equations from the line highlighted in purple into your spreadsheet from here, and then copy down the columns.
I've got it so that it throws you an error in the final column (highlighted grey) if it:
1.) Cannot find a space within 43 characters
2.) The last line can't fit into 42 characters
What do you want to happen when a line fits into less than 43 characters?
e.g. see the "This is short." line
Did you want your output to go:
"This is short.#n"
or
"This is short.#n#n#n"
?
Edit2: At this stage, it might be getting too complicated for formulas, and if the problem got any more complex it would probably be best handled via Excel macros. I don't actually have Excel on this machine, so I shall leave it to the fine minds of Pheezer and Drez to do the equivalent solution.
The macro I gave addresses a single cell with an unknown amount of text that could require thousands of #n's if necessary. Looping it across a range is trivial. If I'm understanding Drez correctly his would accomplish the same.
Since Drez actually has excel installed, I'd ask that he take a cut at it.
Pheezer on
IT'S GOT ME REACHING IN MY POCKET IT'S GOT ME FORKING OVER CASH
CUZ THERE'S SOMETHING IN THE MIDDLE AND IT'S GIVING ME A RASH
I hate to say this guys, but I'm sick, just woke up, am getting dragged out to Christmas Eve dinner, and won't be around until Friday afternoon/night as I'm going away tomorrow.
If somehow I get a chance to respond, I will, but please don't count on me. What you want to do though, Houk, is certainly doable.
Sorry guys, I meant to try this stuff out earlier but things got pretty busy at work. I might have to leave this for later during xmas break (no hurry on this since the files aren't due till after the new year). But I'm sure something here should work.
Ecco - ideally, if the total line is shorter than 43 characters it would just leave off the #n completely, but since that shouldn't affect the text in the final game, it doesn't really matter.
Thanks a ton for the help guys, sorry I can't validate your input faster
Ecco - ideally, if the total line is shorter than 43 characters it would just leave off the #n completely, but since that shouldn't affect the text in the final game, it doesn't really matter.
Sorry, I guess I should have provided more detail. I whipped this up as my eyes were ready to bleed out of their sockets, so I sorta glossed over it.
First of all, this is an actual Excel file. I do localization for a game company, and the Japanese developer sent us this Excel with all the game text in it. So basically, each cell is a box of dialogue in the game. There's about 8500 separate cells, and each cell can have up to 3 line breaks (#n) to appear properly in the game. The max character limit for each line is 42 characters, so only if the 43rd character is a space can there be a break; otherwise, it has to break before that basically.
I don't know what voodoo the Japanese devs do with the Excel file when we send it to them, but it's our job to insert the line breaks ourselves. I probably didn't explain this very well because I have absolutely no programming/formula skill or anything. I'm gonna dive into a few of these formulas and see what happens. Thanks for the help!
Edit: I just realized another hitch that probably throws everything out of whack. Each individual cell will contain up to 3 42-character length lines of text, so after the #n is inserted there can still be more text after that that needs another #n. Hrrrmm...
So what you're saying is this: The text in each of the 8,500 cells comes in a single line with no carriage return within the cell either, right? I mean it's a continous string within the cell itself?
And the string can be a maximum of 42 + 42 + 42 characters (which includes punctuation and spaces, but NOT the #n newline character, yes?
And are all the cells in a single column? If so that's very easy to do.
I'm also assuming that if the 43rd and 85th characters are spaces, you want to truncate the space. In other words, you don't want the new line to begin with a space (and you don't want the end of the last line to end with a space), correct?
So in other words:
Hi, how are you doing today my friend? The character before the word character was a space.
Should become:
Hi, how are you doing today my friend? The#ncharacter before the word character was a#nspace.
Sub Insert_Stuff_At_43rd_Character_Loop_2()
Dim str_CellString As String
Dim iChar, iRow, iLastRow As Integer
Const str_TextToInsert As String = "#n" 'this is the text that will be inserted at the 43rd character
Const iColumn As Integer = 1 'this is the column in which your text resides, change it as necessary
iLastRow = Range("A65536").End(xlUp).Row
For iRow = 1 To iLastRow
str_CellString = Cells(iRow, iColumn).Value
If Len(str_CellString) > 42 Then
If Mid(str_CellString, 43, 1) = " " Then
Cells(iRow, iColumn).Value = Left(str_CellString, 42) & str_TextToInsert & Right(str_CellString, Len(str_CellString) - 43)
Else
If Mid(str_CellString, 43, 1) <> " " Then
iChar = InStrRev(str_CellString, " ", 43)
Cells(iRow, iColumn).Value = Left(str_CellString, iChar) & str_TextToInsert & Right(str_CellString, Len(str_CellString) - iChar)
End If
End If
str_CellString = Cells(iRow, iColumn).Value
If Len(str_CellString) > 85 Then
If Mid(str_CellString, 86, 1) = " " Then
Cells(iRow, iColumn).Value = Left(str_CellString, 85) & str_TextToInsert & Right(str_CellString, Len(str_CellString) - 86)
Else
If Mid(str_CellString, 86, 1) <> " " Then
iChar = InStrRev(str_CellString, " ", 86)
Cells(iRow, iColumn).Value = Left(str_CellString, iChar) & str_TextToInsert & Right(str_CellString, Len(str_CellString) - iChar)
End If
End If
End If
End If
Next iRow
End Sub
man you guys are utterly badass. im working on other stuff at work right now, but ill probably give this a try later tonight when im busting my ass...at home...on christmas eve.
edit: oops, dunno why i double-posted instead of editing. oh well, that's how awesome you are!
Had a slight bug. It wasn't deleting the space if the 85th character fell on something other than a space. The following works:
Sub Insert_Stuff_At_43rd_Character_Loop_2()
Dim str_CellString As String
Dim iChar, iRow, iLastRow As Integer
Const str_TextToInsert As String = "#n" 'this is the text that will be inserted at the 43rd character
Const iColumn As Integer = 1 'this is the column in which your text resides, change it as necessary
iLastRow = Range("A65536").End(xlUp).Row
For iRow = 1 To iLastRow
str_CellString = Cells(iRow, iColumn).Value
If Len(str_CellString) > 42 Then
If Mid(str_CellString, 43, 1) = " " Then
Cells(iRow, iColumn).Value = Left(str_CellString, 42) & str_TextToInsert & Right(str_CellString, Len(str_CellString) - 43)
Else
If Mid(str_CellString, 43, 1) <> " " Then
iChar = InStrRev(str_CellString, " ", 43)
Cells(iRow, iColumn).Value = Left(str_CellString, iChar - 1) & str_TextToInsert & Right(str_CellString, Len(str_CellString) - iChar)
End If
End If
str_CellString = Cells(iRow, iColumn).Value
If Len(str_CellString) > 85 Then
If Mid(str_CellString, 86, 1) = " " Then
Cells(iRow, iColumn).Value = Left(str_CellString, 85) & str_TextToInsert & Right(str_CellString, Len(str_CellString) - 86)
Else
If Mid(str_CellString, 86, 1) <> " " Then
iChar = InStrRev(str_CellString, " ", 86)
Cells(iRow, iColumn).Value = Left(str_CellString, iChar - 1) & str_TextToInsert & Right(str_CellString, Len(str_CellString) - iChar)
End If
End If
End If
End If
Next iRow
End Sub
Just plop it into the VBE in Excel and you're good to go.
jesus christ in a picnic basket, damn thing works like a charm.
for some reason i have no idea why, the macro doesnt work when i swap out the column number 1 (column a) with column number 9 (column i, where the text is). but it DOES work with the default setting, so all i have to do is copy the I column into the A column of another sheet, run the macro, and copy it back
this is fucking aces
just for kicks ill probably play around with the other macros/formulas posted here to see how they go, but man...
Seriously Drez, you just saved me about 25 hours on this project alone, not to mention the uncountable hours on future projects. you are da man
Also note that if you work for a dev (and we both know you do) you probably have access to someone who could whip up this same code, using what I gave as a model to save time explaining yourself, using a far quicker option.
If you were to do this with a text file and write the code in a real programming language rather than an embedded macro language you would find that the code executes about a million times faster. I work with a lot of VBA code written to automate spreadsheet tasks and to assemble reports and things and stuff like the project you're looking at tends to run very slowly in VBA.
Of course, if it takes half an hour to run in VBA and would only take 3 minutes to run in C++, either are pretty shit hot options to spending 3 weeks doing it by hand.
Oh and I missed this. The problem is that all the dev work goes on over in Japan, all we have in-house is tertiary stuff (producers, production staff, etc). And for the most part, it seems like Japan is pretty slow/ineffecient at getting stuff like this done.
Honestly I would kill to have a better option than doing all this in Excel and relying on macro stuff, but sadly that's not an option with the way they operate here. We actually shot them an e-mail yesterday morning before I posted this, but I'm pretty sure they're actually off for the rest of the year, so we never would have heard anything back.
Drez, I don't know if this helps you or not but look at how I avoided ever having a string longer than 43 characters. When I have to insert a break based on the presence of a prior space instead of the current one, I look at the character position that break is at, and set my next limit point at that + 43.
To get it to work correctly for houk I'd have to fix it in light of more recent info he's provided, but it sounds like you're closer to getting it finished anyhow and I suspect your version will run faster than my nested loops.
Pheezer on
IT'S GOT ME REACHING IN MY POCKET IT'S GOT ME FORKING OVER CASH
CUZ THERE'S SOMETHING IN THE MIDDLE AND IT'S GIVING ME A RASH
Yeah then I know it's way faster than mine. Iteration and decision making are horribly slow in VBA, the provided functions are way, way faster because they're running compiled code from within Excel rather than interpreting mountains of VBA.
I'm somewhat stoked to learn about instrrev, though. I didn't know about that one. I guess it's somewhat limited in application, but still cool to know.
Pheezer on
IT'S GOT ME REACHING IN MY POCKET IT'S GOT ME FORKING OVER CASH
CUZ THERE'S SOMETHING IN THE MIDDLE AND IT'S GIVING ME A RASH
See, I'm lucky: All of the data I have to manipulate comes from an Oracle database that enforces all manner of strict formatting rules on the shit stored in it. Nothing I do requires finding anything within a string, it's all about quantifying and categorizing data that's been at least halfway sanitized to begin with.
Pheezer on
IT'S GOT ME REACHING IN MY POCKET IT'S GOT ME FORKING OVER CASH
CUZ THERE'S SOMETHING IN THE MIDDLE AND IT'S GIVING ME A RASH
So I don't see a fixed formula. So I fixed it for Drez, and added a feature from Pheezers code.
This will replace the space with the page break #n, instead of leaving the space in the code. (I can change that if needed.)
It will error check the lenght after the second #n, so if you have a line that is properly broken into 3 segments, and the last segment turns out to be greater than 42 characters long it will both highlight the row, and add a line in the column directly to the right of it that says "Last line is greater than 42."
You can delete or change that as you will, it is one of the last lines in the code.
Sub Insert_Stuff_At_43rd_Character_Loop_Edited()
Dim str_CellString, str_CellSubString, str_CellCheckString As String
Dim iChar, iRow, iLastRow As Integer
Const str_TextToInsert As String = "#n" 'this is the text that will be inserted at the 43rd character
Const iColumn As Integer = 1 'this is the column in which your text resides, change it as necessary
iLastRow = Range("A65536").End(xlUp).Row
For iRow = 1 To iLastRow
str_CellString = Cells(iRow, iColumn).Value
If Len(str_CellString) > 42 Then
If Mid(str_CellString, 43, 1) = " " Then
Cells(iRow, iColumn).Value = Left(str_CellString, 42) & str_TextToInsert & Right(str_CellString, Len(str_CellString) - 43)
Else
If Mid(str_CellString, 43, 1) <> " " Then
iChar = InStrRev(str_CellString, " ", 43)
Cells(iRow, iColumn).Value = Left(str_CellString, iChar - 1) & str_TextToInsert & Right(str_CellString, Len(str_CellString) - iChar)
End If
End If
str_CellString = Cells(iRow, iColumn).Value
If Len(str_CellString) - (InStrRev(str_CellString, "#n") + 1) > 42 Then
str_CellSubString = Application.WorksheetFunction.Replace(str_CellString, 1, InStrRev(str_CellString, "#n") + 1, "")
If Mid(str_CellSubString, 43, 1) = " " Then
Cells(iRow, iColumn).Value = Left(str_CellString, (InStrRev(str_CellString, "#n") + 1)) & Left(str_CellSubString, 42) & str_TextToInsert & Right(str_CellSubString, Len(str_CellSubString) - 43)
Else
If Mid(str_CellSubString, 43, 1) <> " " Then
iChar = InStrRev(str_CellSubString, " ", 43)
Cells(iRow, iColumn).Value = Left(str_CellString, (InStrRev(str_CellString, "#n") + 1)) & Left(str_CellSubString, iChar - 1) & str_TextToInsert & Right(str_CellSubString, Len(str_CellSubString) - iChar)
End If
End If
End If
End If
str_CellString = Cells(iRow, iColumn).Value
str_CellCheckString = Application.WorksheetFunction.Replace(str_CellString, 1, InStrRev(str_CellString, "#n") + 1, "")
If Len(str_CellCheckString) > 42 Then
Cells(iRow, iColumn).Interior.ColorIndex = 36
Cells(iRow, iColumn + 1).Value = "Last line is greater than 42."
End If
Next iRow
End Sub
And it does about 1000 lines a second on my peeetiful PC.
Holy thread camping Drez. Fess up. You were just hiding until someone did it for you, weren't you? :P
Nope, I just happened to be posting in H/A about New York State auditing me and trying to fuck me out of thousands of dollars at the same time this bumped up.
I see that thread, and ouch. As a not tax expert, nor a livin in NY persona, I'mma just stay here in this thread, where its safe. Best o luck with that.
Posts
Actually, the way the OP wrote it, it's just a matter of finding the "last space" in a set of 43 characters. The logic is start at the last character and worked backwards until you find the space, then insert the #n.
I assume you have a column of cells containing 43 characters each? If so, the logic would follow:
(Pseudo-code)
If you're not a programmer and don't know how a for loop works let me know. I thought about writing a macro for you but I'm too out of touch with VBA :P
It sounds a bit to me like what you have is not really an Excel-file that needs to be handled but perhaps simply a list which needs to broken down into a list with items separated by line changes?
Is the file actually in Excel format or is it just really a file that for some reasons is opened by Excel. If it's the later you might have an easier time opening in something like Textpad (Advanced shareware text editor) and then solve you need by using it's advance search&replace function.
Will it always have at least one space in the first 43 characters?
If you're happy to accept cell formulas and can guarantee that there's at least one space in the first 43 characters, then I think it might be possible by doing some trickery using stuff from here and a few formulas in adjacent cells.
In fact, I've whipped a quick demo up with the formulas here.
Your technique is correct but implementation depends on how his data is arranged in the spreadsheet.
OP, you need to explain more about the structure of the data in your Excel file. Is it 20,000 rows of text in column A? Or is all of the text in cell A1? Or what's the situation there?
If you can tell me how the text is laid out I can whip this up for you super quick.
Actually, fuck it. Here's VBA code that should probably run (on a mac without office, too lazy to google all of my syntax, pretty sure it's good though just tell me if you get errors and on which line and I'll figure it out as we go).
This is what you'd need to do to insert as many #n's as necessary for a string of whatever length is encountered.
The only substitutions you need to make are on the line that starts out with apostrophe. Remove the apostrophe and set the variable there to be equal to whatever it needs to equal. If the text is in cell A1, the easiest way to reference that is "activesheet.range("A1").value", but if you have a whole column full of cells that have text in them of varying lengths, we need to wrap a loop around the entire snippet below.
If that is the case, let me know which column the data will be in and I'll do the rest.
So what we do here is loop until the current position in the text exceeds the length of the text.
Then we enter into a loop that we follow until we hit the predefined limit (43 in this case)
While in the inner loop, we look for space characters and each time we find one, we place an anchor on it.
After we've run up to the limit of 43 characters we check whether the character we're on is a space character or not. If it is, then we insert the #n where we are. If it's not, we insert the #n back where our anchor is (the last space we'd encountered).
Then we take where ever we put our #n, and count 43 characters out from there and define that as our next limit, and repeat the inner loop and subsequent #n insertion check.
We keep doing these things until we've gotten to the point where 43 characters further would place us beyond the end of the text, and at that point skip out of the loops.
CUZ THERE'S SOMETHING IN THE MIDDLE AND IT'S GIVING ME A RASH
If you were to do this with a text file and write the code in a real programming language rather than an embedded macro language you would find that the code executes about a million times faster. I work with a lot of VBA code written to automate spreadsheet tasks and to assemble reports and things and stuff like the project you're looking at tends to run very slowly in VBA.
Of course, if it takes half an hour to run in VBA and would only take 3 minutes to run in C++, either are pretty shit hot options to spending 3 weeks doing it by hand.
CUZ THERE'S SOMETHING IN THE MIDDLE AND IT'S GIVING ME A RASH
This performs on the active cell (i.e. wherever your cursor is) but it can be automated to loop through an entire column if that's better.
Personally, I prefer to nest my IF statements. And the "Exit Sub" lines are wholly unnecessary. I wrote it this way so it's easier for you to read and process. A more concise method would be:
But they both do the same thing.
And of course you should change "MOOGLYBOOGLY" to whatever text you want to insert. The code above also assumes you want a space before and after the inserted text.
First, determine if the string is less than 43 characters long. If so, add a space, append the text to the cell, and exit the macro.
Second, determine if the 43rd character is a space. If so, append the text at the 43rd character in the cell, add a space, and exit the macro.
Third, verify that the 43rd character is not a space (not entirely necessary at this point but it's good practice to check anyway). If so, use the InStrRev function to find the most recent previous space (starting at the 43rd character and looking backward), append the text at this location, add a space, and exit the macro.
First of all, this is an actual Excel file. I do localization for a game company, and the Japanese developer sent us this Excel with all the game text in it. So basically, each cell is a box of dialogue in the game. There's about 8500 separate cells, and each cell can have up to 3 line breaks (#n) to appear properly in the game. The max character limit for each line is 42 characters, so only if the 43rd character is a space can there be a break; otherwise, it has to break before that basically.
I don't know what voodoo the Japanese devs do with the Excel file when we send it to them, but it's our job to insert the line breaks ourselves. I probably didn't explain this very well because I have absolutely no programming/formula skill or anything. I'm gonna dive into a few of these formulas and see what happens. Thanks for the help!
Edit: I just realized another hitch that probably throws everything out of whack. Each individual cell will contain up to 3 42-character length lines of text, so after the #n is inserted there can still be more text after that that needs another #n. Hrrrmm...
Well, we can work with that. Did the formula in the cells work for you? Or did the macros work better?
I'll update my formulas to do what you want if you want, gimme a few minutes.
Edit: Here you go. Reformatted the spreadsheet so that it handles up to 3 42-character length lines.
If you're happy to use formulas and cells, just copy the equations from the line highlighted in purple into your spreadsheet from here, and then copy down the columns.
http://spreadsheets.google.com/pub?key=pplXt6VPMU-mOb5CjM3voGw
I've got it so that it throws you an error in the final column (highlighted grey) if it:
1.) Cannot find a space within 43 characters
2.) The last line can't fit into 42 characters
What do you want to happen when a line fits into less than 43 characters?
e.g. see the "This is short." line
Did you want your output to go:
"This is short.#n"
or
"This is short.#n#n#n"
?
Edit2: At this stage, it might be getting too complicated for formulas, and if the problem got any more complex it would probably be best handled via Excel macros. I don't actually have Excel on this machine, so I shall leave it to the fine minds of Pheezer and Drez to do the equivalent solution.
Since Drez actually has excel installed, I'd ask that he take a cut at it.
CUZ THERE'S SOMETHING IN THE MIDDLE AND IT'S GIVING ME A RASH
If somehow I get a chance to respond, I will, but please don't count on me. What you want to do though, Houk, is certainly doable.
Ecco - ideally, if the total line is shorter than 43 characters it would just leave off the #n completely, but since that shouldn't affect the text in the final game, it doesn't really matter.
Thanks a ton for the help guys, sorry I can't validate your input faster
Done and done.
No worries. Merry Christmas!
So what you're saying is this: The text in each of the 8,500 cells comes in a single line with no carriage return within the cell either, right? I mean it's a continous string within the cell itself?
And the string can be a maximum of 42 + 42 + 42 characters (which includes punctuation and spaces, but NOT the #n newline character, yes?
And are all the cells in a single column? If so that's very easy to do.
So in other words:
Hi, how are you doing today my friend? The character before the word character was a space.
Should become:
Hi, how are you doing today my friend? The#ncharacter before the word character was a#nspace.
Right?
edit: oops, dunno why i double-posted instead of editing. oh well, that's how awesome you are!
Just plop it into the VBE in Excel and you're good to go.
for some reason i have no idea why, the macro doesnt work when i swap out the column number 1 (column a) with column number 9 (column i, where the text is). but it DOES work with the default setting, so all i have to do is copy the I column into the A column of another sheet, run the macro, and copy it back
this is fucking aces
just for kicks ill probably play around with the other macros/formulas posted here to see how they go, but man...
Seriously Drez, you just saved me about 25 hours on this project alone, not to mention the uncountable hours on future projects. you are da man
Give me a few minutes.
The problem is this: Currently the code may allow a second line of greater than 42 characters. It's a slight math problem. Hold on.
CUZ THERE'S SOMETHING IN THE MIDDLE AND IT'S GIVING ME A RASH
Backlog Wars - Sonic Generations | Steam!
Viewing the forums through rose colored glasses... or Suriko's Ye Old Style and The PostCount/TimeStamp Restoral Device
H/A : We have a crack team of programmers, debuggers, and testers on call, 24/7, 365 days of the year? :P
Hahaha, oh man.
Honestly I would kill to have a better option than doing all this in Excel and relying on macro stuff, but sadly that's not an option with the way they operate here. We actually shot them an e-mail yesterday morning before I posted this, but I'm pretty sure they're actually off for the rest of the year, so we never would have heard anything back.
Drez, I don't know if this helps you or not but look at how I avoided ever having a string longer than 43 characters. When I have to insert a break based on the presence of a prior space instead of the current one, I look at the character position that break is at, and set my next limit point at that + 43.
To get it to work correctly for houk I'd have to fix it in light of more recent info he's provided, but it sounds like you're closer to getting it finished anyhow and I suspect your version will run faster than my nested loops.
CUZ THERE'S SOMETHING IN THE MIDDLE AND IT'S GIVING ME A RASH
I'm somewhat stoked to learn about instrrev, though. I didn't know about that one. I guess it's somewhat limited in application, but still cool to know.
CUZ THERE'S SOMETHING IN THE MIDDLE AND IT'S GIVING ME A RASH
I'm also a fan of using Regular Expressions but it's just easier to go this route.
Houk, I'm on the track. Worst case scenario is I have the foolproof solution for you early tomorrow. I have one procedural thing to puzzle out.
CUZ THERE'S SOMETHING IN THE MIDDLE AND IT'S GIVING ME A RASH
This will replace the space with the page break #n, instead of leaving the space in the code. (I can change that if needed.)
It will error check the lenght after the second #n, so if you have a line that is properly broken into 3 segments, and the last segment turns out to be greater than 42 characters long it will both highlight the row, and add a line in the column directly to the right of it that says "Last line is greater than 42."
You can delete or change that as you will, it is one of the last lines in the code.
And it does about 1000 lines a second on my peeetiful PC.
Backlog Wars - Sonic Generations | Steam!
Viewing the forums through rose colored glasses... or Suriko's Ye Old Style and The PostCount/TimeStamp Restoral Device
Backlog Wars - Sonic Generations | Steam!
Viewing the forums through rose colored glasses... or Suriko's Ye Old Style and The PostCount/TimeStamp Restoral Device
Nope, I just happened to be posting in H/A about New York State auditing me and trying to fuck me out of thousands of dollars at the same time this bumped up.
Backlog Wars - Sonic Generations | Steam!
Viewing the forums through rose colored glasses... or Suriko's Ye Old Style and The PostCount/TimeStamp Restoral Device