
C programming problem

durandal4532 Registered User regular
edited November 2007 in Help / Advice Forum
So, I've been going slowly insane.

I wrote a program that is basically a fake OS. In it, I have a fake Process Control Block. One of the functions in the OS is "RUN", which adds a PCB struct to a list with a pointer to a textfile, and then when you input "GO", the script in that textfile is executed.

Now, that worked fine not two weeks ago.

Yesterday, I tried doing the new assignment's programming part, which involves slightly changing the PCB struct. Basically, adding a process ID field, an int.

So my change was to go from

typedef struct PCB_REC {
    char* filename;
    FILE* process;
    struct PCB_REC* next;
} PCB_REC;

to

typedef struct PCB_REC {
    int pid;
    char* filename;
    FILE* process;
    struct PCB_REC* next;
} PCB_REC;

This caused a seg fault every time I used "GO".

Now, I decided to go back to my original and use that to figure out what could be going wrong. That worked for a bit, but got me no closer to figuring out the reason. Then I defragged and restarted because my computer was running insanely slow. Now, the original code doesn't work either, in exactly the same manner.

So, basically: anyone know whether it could have something to do with just my environment? Am I missing something basic? Why would adding a field to a struct cause a seg fault?

We're all in this together

Posts

  • DrFrylock Registered User regular
    edited November 2007
    In C, a wild pointer can go off and blow away memory in the most random of places. Thus, it could be that a bug has always existed in some other part of your code that causes a pointer to read from or write to memory it's not supposed to, and only the particular layout of data in memory that the compiler was creating was keeping things "running" (maybe there was also a latent problem that didn't cause a crash, but caused some data corruption that you hadn't detected yet). Adding a field to the struct slightly changed the data layout and so your heretofore-undiscovered wild pointer is going off and doing even more damage now.

    This may also explain why things seem to act nondeterministically. I'd run the code through a debugger and keep a good watch on your pointer operations and see if something's going haywire.

  • Smasher Starting to get dizzy Registered User regular
    edited November 2007
    You need to pin down exactly when it's crashing. It could be when it's traversing the PCB list, when it's reading the text file, when it's attempting to actually execute the script, or even some other point altogether.

    My feeling as to why it was working before and not after you added the field is that there was some subtle error in your code before, but for whatever reason you got lucky and it wasn't breaking anything visibly. I'd say the top two suspects are that you're trying to dereference an uninitialized or NULL pointer, or that you made some other small change that broke something and forgot about it.

  • Janin Registered User regular
    edited November 2007
    Post the entire code. If you have to, zip it up and upload to Mediafire.

  • iconduck Registered User regular
    edited November 2007
    Try adding your int to the end of the structure instead of the beginning and see what happens.

    My guess is that you're not doing a clean compile, and that there are still some components that are using the old definition of the struct instead of the new definition.

    Given the buffer 00000001 7fffff34 7fffff30 7fffff58

    Your new code sees:
    pid = 1
    filename = 7fffff34 (valid memory)
    process = 7fffff30 (valid memory)
    next = 7fffff58 (valid memory)

    The old components see:
    filename = 00000001 (invalid memory)
    process = 7fffff34 (valid, but not correct)
    next = 7fffff30 (valid, but not correct)

  • UndefinedMonkey Registered User regular
    edited November 2007
    If it's a segfault, it's a C program, and it's intermittent (that is, it works part of the time), it's usually memory corruption.

    I don't have the code in front of me, but here's my gut feeling on what's happening. Somewhere, you're working with a pointer to this struct (or within this struct) that isn't initialized correctly. Either that or you're indexing into your struct in some strange way that doesn't take this new field into account (less likely, but still possible).

    If it's the former, things worked for a little while because the memory you were corrupting wasn't tied to anything critical, or you had enough "padding" in your program to contain the corruption. Adding a new field into the struct changed this because the int field pushed things beyond your "safe zone" in the memory.

    If it's the latter, you're grabbing the wrong section of memory when you index into your struct... for example, you're grabbing the PID as part of your filename, which could cause all kinds of problems.

    The defragging thing is definitely puzzling.

    I'd go through this thing with a fine-toothed comb. Start up your debugger and step through the program, line by line. Find where the program dies and work your way back from there. Watch all of your variables for weirdness. Weirdness includes (but is not limited to): changing one variable and overwriting part of another somewhere else in memory; assigning a new value to a pointer with a non-null value; accessing members of a NULLed or non-initialized pointer; garbled or nonsensical values in structures (especially strings.)

    edit: beat'd

    This space intentionally left blank.
  • durandal4532 Registered User regular
    edited November 2007
    Okay, thank you all for the help, but this can be locked.

    It...

    IT WAS AN ==. I missed an = somewhere. How the hell I managed to test-run earlier I do not know.

    Let this be a lesson: look for all of the really really simple stuff first.

    Good lord I'm embarrassed now.

    We're all in this together
  • Smasher Starting to get dizzy Registered User regular
    edited November 2007
    :lol:

    A neat trick I've heard to prevent that problem is, whenever you're doing an equality comparison and there's an rvalue involved (in other words, an expression or literal or something else that you can't assign a value to), put it on the left side. So if you'd normally write if(val == 1), do if(1 == val) instead. That way if you mess up and use = instead of == you get a compile time error instead of subtly broken code.

    Was your particular error that you assigned NULL to a pointer instead of comparing it?

  • durandal4532 Registered User regular
    edited November 2007
    Yes, yes it was. Also, nice trick, I'll have to start doing that.

    We're all in this together