Wednesday, April 24, 2013

Using YAJL with Objective-C

I have been tasked with getting an existing iOS code base ready to be released to the App Store. After loading the existing project into Xcode and resolving static library link issues, I was able to run the program but not log in. I added some #ifdef DEBUG processing to let me log in to a server without a valid certificate, and then got to play around in the app on the simulator.

I was quickly able to crash the app, and after setting up the debugger to stop on all exceptions I found it was running out of memory. Boy, if you can run the simulator out of memory, you know you can easily do that on a device.

Looking through the code, I found it was using NSJSONSerialization for all JSON processing. That library is fine, and it is provided by Apple so you know it is legal to use, but it builds the entire object tree in memory, so it is not very efficient when you have large amounts of data. If you have used XML in the past and are just getting into JSON, this is similar to using DOM vs SAX. DOM is great and easy and hands you a nice tree of data, but if all you are doing is plucking things out of that tree to put in another format, you are better off using SAX, especially if there are memory concerns.
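
To make the DOM comparison concrete, here is a minimal sketch of the tree-style approach. The function name and the "results" / "query" keys are made up for illustration and are not taken from the real project. Note that both the raw NSData and the fully parsed tree are in memory at the same time, even though only one field per record is kept.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: the "results" and "query" keys are assumptions
// used only to illustrate the DOM-style pattern. data must not be nil.
static NSArray *QueriesFromJSONData(NSData *data, NSError **error)
{
    // Builds the complete NSDictionary / NSArray tree in memory.
    id root = [NSJSONSerialization JSONObjectWithData:data options:0 error:error];
    if (![root isKindOfClass:[NSDictionary class]])
    {
        return nil;   // Parse failure, or the top level is not an object
    }

    NSMutableArray *queries = [NSMutableArray array];
    id results = [root objectForKey:@"results"];
    if (![results isKindOfClass:[NSArray class]])
    {
        return queries;
    }

    // Pluck one field out of each record; the rest of the tree is wasted memory.
    for (id record in results)
    {
        if (![record isKindOfClass:[NSDictionary class]])
        {
            continue;
        }
        id query = [record objectForKey:@"query"];
        if ([query isKindOfClass:[NSString class]])
        {
            [queries addObject:query];
        }
    }
    return queries;
}
```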

The code was parsing everything into a big NSDictionary-based tree with 13,318 base records and tons more child records. This is just not going to work on a memory-limited device.

I looked up information on stream-based JSON parsers and landed on YAJL, which you can find at https://github.com/gabriel/yajl-objc. I won't cover how to configure it for Xcode, as all that information is on the project site. It was easy to set up for both Mac OS X (my test program) and iOS.

They show a streaming sample that was too basic to be very helpful. Hopefully I can fill in some of the gaps. The first issue with their sample is that they read the WHOLE file into memory via NSData and then run it through the parser. Not fully streaming at that point. Second, they show no samples of what to do in the callbacks.

The code snippet below will help you parse your JSON by keeping track of where you are in the data. It will keep a stack of named dictionaries and arrays as you parse through the data. You can then create and act on the proper objects as you get map keys and values.

It also reads in the JSON file in chunks instead of tossing the whole thing into NSData. In my final iOS code I am using an NSURLConnection and parsing the chunks in the didReceiveData call.
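
For the network case, a rough sketch of the idea might look like the following. This is not the actual app code: the StreamingJSONLoader class and its method names are hypothetical, the YAJL import path is an assumption that depends on how you added the library, and ARC is assumed. Only the NSURLConnection delegate methods and the YAJLParser calls already used in this post are real API. The object passed in as parserDelegate is whatever implements the YAJL callbacks.

```objective-c
#import <Foundation/Foundation.h>
#import <YAJL/YAJL.h>   // Assumption: adjust to match your YAJL setup

// Hypothetical class: feeds each network chunk straight to YAJL.
@interface StreamingJSONLoader : NSObject <NSURLConnectionDataDelegate>
- (id)initWithParserDelegate:(id<YAJLParserDelegate>)parserDelegate;
- (void)loadFromURL:(NSURL *)url;
@end

@implementation StreamingJSONLoader
{
    id<YAJLParserDelegate> callbacks;   // Object that handles the YAJL callbacks
    YAJLParser *parser;
    NSURLConnection *connection;
}

- (id)initWithParserDelegate:(id<YAJLParserDelegate>)parserDelegate
{
    self = [super init];
    if (self != nil) {
        callbacks = parserDelegate;
    }
    return self;
}

- (void)loadFromURL:(NSURL *)url
{
    parser = [[YAJLParser alloc] initWithParserOptions:YAJLParserOptionsAllowComments];
    parser.delegate = callbacks;

    // initWithRequest:delegate: starts loading right away.
    NSURLRequest *request = [NSURLRequest requestWithURL:url];
    connection = [[NSURLConnection alloc] initWithRequest:request delegate:self];
}

// Called repeatedly as data arrives; each chunk is parsed and then dropped.
- (void)connection:(NSURLConnection *)aConnection didReceiveData:(NSData *)data
{
    [parser parse:data];
    if (parser.parserError) {
        NSLog(@"Error:\n%@", parser.parserError);
        [aConnection cancel];   // No point downloading the rest
        [self finish];
    }
}

- (void)connectionDidFinishLoading:(NSURLConnection *)aConnection
{
    [self finish];
}

- (void)connection:(NSURLConnection *)aConnection didFailWithError:(NSError *)error
{
    NSLog(@"Download failed: %@", error);
    [self finish];
}

// Drop references once the transfer is over, however it ended.
- (void)finish
{
    parser.delegate = nil;
    parser = nil;
    connection = nil;
}

@end
```

The point of this shape is that no NSMutableData buffer ever accumulates the response; the only data held at any time is the chunk currently being parsed plus whatever objects the callbacks decide to keep.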

The original code ran out of memory when receiving 38.9 MB of data and took 5 seconds to run. I put in some cheater try / catch blocks to let it "finish" without crashing so I could get timing results. The new code does not run out of memory and runs in 2 seconds. Of course, this is all in the simulator on a fast MacBook Pro with a lot of RAM. I don't currently have a device to test on but will get one soon.

I looked around on the web for a long time before I was able to piece together enough knowledge to pull this off so hopefully it will help others get up to speed a lot faster.

Things that go in the header file

// Array used as a stack to keep track of where we are
NSMutableArray *stack; // create this in init method
NSString *mapKey;

// Actual class code

- (id) init
{
    self = [super init];
    if (self != nil) {
        stack = [[NSMutableArray alloc] init];
    }
    return self;
}


// Code to parse the file in chunks

- (void) parseFile:(NSString *)jsonFileName
{
    NSFileHandle *handle = [NSFileHandle fileHandleForReadingAtPath:jsonFileName];
    if (handle == nil)
    {
        NSLog(@"Could not open %@", jsonFileName);
        return;
    }

    YAJLParser *parser = [[YAJLParser alloc] initWithParserOptions:YAJLParserOptionsAllowComments];
    parser.delegate = self;
    NSUInteger chunkSize = 10240;     // Read 10KB chunks.

    NSLog(@"Starting parse of %@...", jsonFileName);
    [stack removeAllObjects];
    mapKey = nil;

    BOOL done = NO;
    while (!done)
    {
        // Drain the pool on every pass so each chunk is released
        // before the next one is read.
        @autoreleasepool
        {
            // readDataOfLength: advances the file pointer itself, so no
            // explicit seek is needed. It returns empty data at end of file.
            NSData *data = [handle readDataOfLength:chunkSize];
            if ([data length] == 0)
            {
                done = YES;
            }
            else
            {
                [parser parse:data];
                if (parser.parserError)
                {
                    NSLog(@"Error:\n%@", parser.parserError);
                    done = YES;   // Stop feeding data once the parser has failed
                }
            }
        }
    }

    [handle closeFile];
    parser.delegate = nil;
}

// Push a name for this dictionary: the pending map key if there is one,
// otherwise the name on top of the stack (e.g. a dictionary inside an array)
- (void)parserDidStartDictionary:(YAJLParser *)parser
{
    NSString *dictName = mapKey;
    if (mapKey == nil)
    {
        dictName = (stack.count == 0) ? @"" : [stack lastObject];
    }
    [stack addObject:(dictName)];

    // Create new object here based on dictionary name
    // ready to hold soon to come key + value pairs
}

// End of dictionary - pop off stack
- (void)parserDidEndDictionary:(YAJLParser *)parser
{
    mapKey = nil;
    [stack removeLastObject];

    // Save object created in start Dictionary to
    // array or core data here
}

// New array starting, push name onto stack (use previous if we don't have one)
- (void)parserDidStartArray:(YAJLParser *)parser
{
    NSString *arrayName = mapKey;
    if (mapKey == nil)
    {
        arrayName = stack.count == 0 ? @"" : [stack lastObject];
    }
    [stack addObject:(arrayName)];
}

// End of array, pop off stack
- (void)parserDidEndArray:(YAJLParser *)parser
{
    mapKey = nil;
    [stack removeLastObject];
}

- (void)parser:(YAJLParser *)parser didMapKey:(NSString *)key
{
    mapKey = key;
    // Do processing here for each map key
    // Something like
    // if (!inResults && [key isEqualToString:@"results"])
    // {
    //    inResults = true;
    // }
}

- (void)parser:(YAJLParser *)parser didAdd:(id)value
{
   // Do processing here for each value (will have a mapKey)
   // Something like

   // if (inResults && [mapKey isEqualToString:@"query"])
   // {
   //   query = ((NSString *)value);
   // }
   //
   // value will be (NSString *), (NSNull *) or (NSNumber *)
   // booleans come as (NSNumber *) 0 = false, 1 = true

}
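
To show what filling in the callbacks can look like, here is a small self-contained sketch that uses the same stack idea to collect every "query" string found anywhere under a "results" key. The QueryCollector class, the pushContainer / popContainer helpers, and the "results" / "query" key names are all hypothetical, the YAJL import path is an assumption, and ARC is assumed. It is one possible way to use the stack, not the code from the real app.

```objective-c
#import <Foundation/Foundation.h>
#import <YAJL/YAJL.h>   // Assumption: adjust to match your YAJL setup

// Hypothetical collector: gathers "query" strings found under "results".
@interface QueryCollector : NSObject <YAJLParserDelegate>
@property (nonatomic, readonly) NSMutableArray *queries;
@end

@implementation QueryCollector
{
    NSMutableArray *stack;   // Names of the containers we are currently inside
    NSString *mapKey;        // Most recent key seen
}

- (id)init
{
    self = [super init];
    if (self != nil) {
        stack = [[NSMutableArray alloc] init];
        _queries = [[NSMutableArray alloc] init];
    }
    return self;
}

// Same naming rule as above: use the pending key, else inherit the parent's name.
- (void)pushContainer
{
    NSString *name = mapKey;
    if (name == nil) {
        name = (stack.count == 0) ? @"" : [stack lastObject];
    }
    [stack addObject:name];
}

- (void)popContainer
{
    mapKey = nil;
    [stack removeLastObject];
}

- (void)parserDidStartDictionary:(YAJLParser *)parser { [self pushContainer]; }
- (void)parserDidEndDictionary:(YAJLParser *)parser   { [self popContainer]; }
- (void)parserDidStartArray:(YAJLParser *)parser      { [self pushContainer]; }
- (void)parserDidEndArray:(YAJLParser *)parser        { [self popContainer]; }

- (void)parser:(YAJLParser *)parser didMapKey:(NSString *)key
{
    mapKey = key;
}

- (void)parser:(YAJLParser *)parser didAdd:(id)value
{
    // The stack tells us where we are; mapKey tells us which field this is.
    BOOL inResults = [stack containsObject:@"results"];
    if (inResults && [mapKey isEqualToString:@"query"] &&
        [value isKindOfClass:[NSString class]]) {
        [_queries addObject:value];
    }
}

@end
```

To use it, set an instance as the parser's delegate, feed the parser chunks with parse: as shown earlier, and read the queries property when the data runs out. Only the strings you asked for are kept; everything else is dropped as it streams past.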

