Declared in: GITObject+Parsing.h
This category provides parsing abilities to the GITObject class. These are intended to be used by the GITObject descendant classes to parse their contents.
Parsing Known Length String
In some cases you will know the length of the string you need to find before parsing. For example, given the buffer "foo bar\n", we want the "bar" string. We don't know in advance that it is "bar", but we do know that it is three characters long, is prefixed with "foo " and ends with "\n". To detect the starting point of "bar" and how long it is, we use a parsingRecord structure { "foo ", 4, 4, 3, '\n' }. The first two fields are the starting pattern "foo " and the length of that pattern, which is 4. The third field indicates that matching starts 4 characters into the buffer, which in this case is immediately after the pattern. The fourth field indicates how long the match will be, and the last field is the terminating character, in this case a newline. Because the match length is known, an error will be returned if the ending character does not match the character matchLen characters from the start position.
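The known-length check described above can be sketched in plain C. This is only an illustration, not the framework's implementation: the field names startPattern and patternLen, all field types, and the helper match_known_length are assumptions made for the example; only startLen, matchLen and endChar are named in the text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Field order follows the text. The names of the first two fields and all
 * of the types are assumptions made for this sketch. */
typedef struct {
    const char *startPattern; /* assumed name: pattern the buffer starts with */
    size_t      patternLen;   /* assumed name: length of startPattern */
    int         startLen;     /* offset of the match from the buffer start */
    size_t      matchLen;     /* number of characters in the match */
    int         endChar;      /* character expected right after the match */
} parsingRecord;

/* Hypothetical helper: returns 1 and sets *start on success, 0 on error.
 * The buffer is treated as a NUL-terminated string for simplicity. */
static int match_known_length(const char *buf, parsingRecord rec, const char **start)
{
    if (rec.startLen < 0)
        return 0; /* negative offsets are not handled in this sketch */
    if (strncmp(buf, rec.startPattern, rec.patternLen) != 0)
        return 0; /* buffer does not begin with the pattern */
    if (strlen(buf) < (size_t)rec.startLen + rec.matchLen)
        return 0; /* buffer too short to hold the match */

    const char *s = buf + rec.startLen;
    if (s[rec.matchLen] != rec.endChar)
        return 0; /* character matchLen past the start is not endChar */

    *start = s;
    return 1;
}

void demo_known_length(void)
{
    parsingRecord rec = { "foo ", 4, 4, 3, '\n' };
    const char *start = NULL;

    assert(match_known_length("foo bar\n", rec, &start) == 1);
    assert(strncmp(start, "bar", rec.matchLen) == 0);

    /* "barn" is four characters, so the byte after the match is 'n'. */
    assert(match_known_length("foo barn\n", rec, &start) == 0);
}
```

The second call in demo_known_length shows the error case: the match is one character longer than matchLen says, so the terminator check fails.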
Parsing Unknown Length Strings
When the length of the target string is not known, usage is similar to known-length parsing, except that the .matchLen field of the parsingRecord is set to zero so that the end of the match is determined automatically from the .endChar field. Take the known-length example, but suppose the length of "bar" is unknown while the ending character is known. In this case the parsingRecord structure would be initialised to { "foo ", 4, 4, 0, '\n' } so that detection of the ending character terminates the match.
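As a rough illustration of the zero-length case, the sketch below scans forward from the start offset until it finds the ending character. The struct layout, the field names startPattern and patternLen, and the helper match_until_end_char are assumptions for the example rather than the framework's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed layout; only startLen, matchLen and endChar are named in the text. */
typedef struct {
    const char *startPattern; /* assumed name */
    size_t      patternLen;   /* assumed name */
    int         startLen;
    size_t      matchLen;     /* 0 means: find the end via endChar */
    int         endChar;
} parsingRecord;

/* Hypothetical helper: on success returns 1, sets *start to the first
 * matched character and *len to the number of characters before endChar. */
static int match_until_end_char(const char *buf, parsingRecord rec,
                                const char **start, size_t *len)
{
    if (rec.startLen < 0 || rec.matchLen != 0)
        return 0; /* this sketch covers only the unknown-length case */
    if (strncmp(buf, rec.startPattern, rec.patternLen) != 0)
        return 0;
    if (strlen(buf) < (size_t)rec.startLen)
        return 0;

    const char *s = buf + rec.startLen;
    const char *e = strchr(s, rec.endChar); /* first terminator after start */
    if (e == NULL)
        return 0;

    *start = s;
    *len = (size_t)(e - s);
    return 1;
}

void demo_unknown_length(void)
{
    parsingRecord rec = { "foo ", 4, 4, 0, '\n' };
    const char *start = NULL;
    size_t len = 0;

    /* The same record matches targets of different lengths. */
    assert(match_until_end_char("foo bar\n", rec, &start, &len) == 1);
    assert(len == 3 && strncmp(start, "bar", len) == 0);

    assert(match_until_end_char("foo barbaz\n", rec, &start, &len) == 1);
    assert(len == 6 && strncmp(start, "barbaz", len) == 0);
}
```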
Parsing Ends of Strings
Parsing the ends of strings has one primary limitation at this time: the length of the suffix must be known. This is due to the way in which the .startLen field of the parsingRecord structure is used; the limitation may be removed if there is a great need for it. Parsing string suffixes uses a negative value in the .startLen field together with the ending character to find the end, and thereby the start, of the match. The .matchLen field may be specified to match only a specific number of characters from the determined match start. As an example, we have the string "foo bar 1262562908 +0000\n" and the desired match is 1262562908. We only know that the prefix of the string is "foo " and that the "bar" section can be of any length, so we have to match against the end of the string. To obtain this match, a parsingRecord structure of { "foo ", 4, -17, 10, '\n' } would be specified. The .startLen causes the determined starting point to fall on the 1 of 1262562908, and the .matchLen limits the match end to the ' ' character after the 8 in 1262562908. If the "+0000" was later wanted, a similar parsingRecord structure of { "foo ", 4, -7, 5, '\n' } would be specified, though the .matchLen could also be set to 0 in this case.
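The suffix lookup for the 1262562908 example can be sketched as follows. The text does not state exactly where the negative offset is anchored, so this sketch assumes it is counted back from the byte just past the ending character, which is the reading under which -17 lands on the 1. The struct layout, the field names startPattern and patternLen, and the helper match_suffix are likewise assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed layout; only startLen, matchLen and endChar are named in the text. */
typedef struct {
    const char *startPattern; /* assumed name */
    size_t      patternLen;   /* assumed name */
    int         startLen;     /* negative: counted back from the end */
    size_t      matchLen;     /* 0 means: match up to endChar */
    int         endChar;
} parsingRecord;

/* Hypothetical helper. ASSUMPTION: the negative startLen is measured back
 * from the position just past endChar. */
static int match_suffix(const char *buf, parsingRecord rec,
                        const char **start, size_t *len)
{
    if (rec.startLen >= 0)
        return 0; /* this sketch covers only suffix records */
    if (strncmp(buf, rec.startPattern, rec.patternLen) != 0)
        return 0;

    /* Find the terminator somewhere after the prefix. */
    const char *e = strchr(buf + rec.patternLen, rec.endChar);
    if (e == NULL)
        return 0;

    size_t back = (size_t)(-rec.startLen);
    size_t avail = (size_t)(e + 1 - (buf + rec.patternLen));
    if (back > avail)
        return 0; /* the suffix would overlap the prefix */

    const char *s = e + 1 - back;
    size_t max = back - 1; /* characters between s and endChar */
    if (rec.matchLen > max)
        return 0;

    *start = s;
    *len = rec.matchLen ? rec.matchLen : max;
    return 1;
}

void demo_suffix(void)
{
    parsingRecord rec = { "foo ", 4, -17, 10, '\n' };
    const char *buf = "foo bar 1262562908 +0000\n";
    const char *start = NULL;
    size_t len = 0;

    assert(match_suffix(buf, rec, &start, &len) == 1);
    assert(len == 10 && strncmp(start, "1262562908", len) == 0);
    assert(start[len] == ' '); /* the match stops at the space after the 8 */
}
```

Because the offset is taken from the end, the "bar" section can grow or shrink without changing the record.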
Parsing Strings without a Known End Character
There are also cases where the length of the match is known but there is no terminating character; an example of this is `tree` objects in git repositories. These consist of a number of entries butted up against one another, with a known-length string as the last part of each entry. In this case the .endChar can be set to -1. This prevents the endChar from being matched and also prevents the provided buffer's position from being moved past the end of the matched string. To match a string like this, a parsingRecord structure such as { "foo ", 4, 4, 20, -1 } would be used to match the last twenty characters of the string starting with "foo ".
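To illustrate why leaving the buffer position at the end of the match matters, the sketch below parses two entries that sit directly next to each other. It is an assumption-laden illustration: the struct layout, the field names startPattern and patternLen, the use of int for endChar (so that -1 is representable regardless of whether char is signed) and the helper match_fixed are not taken from the framework. The demo uses NUL-terminated text; real tree data is binary and would need an explicit length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed layout; only startLen, matchLen and endChar are named in the text. */
typedef struct {
    const char *startPattern; /* assumed name */
    size_t      patternLen;   /* assumed name */
    int         startLen;
    size_t      matchLen;
    int         endChar;      /* -1 means: no terminating character */
} parsingRecord;

/* Hypothetical helper: on success returns 1, sets *start to the match and
 * advances *bytes. With endChar == -1 the position stops at the end of the
 * match; otherwise it also steps over the terminator. */
static int match_fixed(const char **bytes, parsingRecord rec, const char **start)
{
    const char *buf = *bytes;

    if (rec.startLen < 0)
        return 0;
    if (strncmp(buf, rec.startPattern, rec.patternLen) != 0)
        return 0;
    if (strlen(buf) < (size_t)rec.startLen + rec.matchLen)
        return 0;

    const char *s = buf + rec.startLen;
    const char *next = s + rec.matchLen;
    if (rec.endChar != -1) {
        if (*next != rec.endChar)
            return 0;
        next++; /* step over the terminator */
    }

    *start = s;
    *bytes = next;
    return 1;
}

void demo_no_end_char(void)
{
    parsingRecord rec = { "foo ", 4, 4, 20, -1 };
    /* Two entries with nothing between them, 20 payload characters each. */
    const char *buf = "foo " "AAAAAAAAAA" "AAAAAAAAAA"
                      "foo " "BBBBBBBBBB" "BBBBBBBBBB";
    const char *p = buf;
    const char *start = NULL;

    assert(match_fixed(&p, rec, &start) == 1);
    assert(start == buf + 4 && p == buf + 24); /* p sits on the next "foo " */

    assert(match_fixed(&p, rec, &start) == 1);
    assert(start == buf + 28 && *p == '\0');   /* whole buffer consumed */
}
```

If the position were moved one byte further after each match, as it is when a terminator is present, the second entry's prefix would no longer line up.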
- (GITObjectHash *)newObjectHashWithObjectRecord:(parsingRecord)record bytes:(const char **)bytes
record: Record describing the string to match.
bytes: Pointer to the byte stream to search. This byte stream should be either an unpacked SHA1 string (40 bytes) or packed SHA1 data (20 bytes).
Returns a GITObjectHash object matching the record, or nil if there is no match.
Creates and returns a GITObjectHash matching the specified record.
This method creates a GITObjectHash directly from the raw bytes corresponding to the git object id record. It bypasses the step of allocating and initializing an NSString object, which is expensive when repeated many times while parsing a large git object graph. During object parsing the expected type of the underlying data is typically known, and this method allows that information to be used for performance gains.
Use this method to directly parse the git object id (hash) record for an object.
Declared in: GITObject+Parsing.h
- (NSString *)newStringWithObjectRecord:(parsingRecord)record bytes:(const char **)bytes
record: Record describing the string to match.
bytes: Pointer to the byte stream to search.
Returns a string matching the record, or nil if there is no match.
Creates and returns a string matching the format defined by the record.
Declared in: GITObject+Parsing.h
- (NSString *)newStringWithObjectRecord:(parsingRecord)record bytes:(const char **)bytes encoding:(NSStringEncoding)encoding
record: Record describing the string to match.
bytes: Pointer to the byte stream to search.
encoding: NSString encoding to interpret the bytes with when creating the string.
Returns a string matching the record, or nil if there is no match.
Creates and returns a string matching the format defined by the record.
Declared in: GITObject+Parsing.h
Last updated: 2011-2-20