Here's a snippet of information I'm pulling from a device, within which I want to grab certain parts:
* Release
Model <model>
Factory IP Address <factory ip>
Hardware T1r1.1.UU, 520 MHz, 128 MByte RAM
Image Sensor and Lens b/w (F/2.0), color (F/2.0)
Software <software> (2007-12-20)
* Networking
BOOTP/DHCP off
Zeroconf on
Camera Name <name>
IP Address 10.10.10.3
Network Mask 255.0.0.0
Broadcast 10.255.255.255
Link Local Address 169.254.200.49
DNS Server 10.0.0.9
Statistics Dropped: 0.0% Collisions: 0%
LEC: 0 SEC: 0
* Routing
Default Route Gateway: 10.0.0.1 Connection: Ethernet interface
* ISDN Dial-In
Camera MSN answer calls to every MSN
Security PAP
Login Name guest
Camera IP Address <ip>
* System
Date and Time <date>
Current Uptime <uptime>
* Audio
This is exactly how the data is sent to me, linebreaks and all. I only want the parts within angle brackets. This is the regexp I was using (which isn't working perfectly):
// 1 2 3 4 5 6 7 8 9
$pattern = "/Model (.*)\nFactory IP Address (.*)\n(.*)Software (.*) (.*)Camera Name (.*)\n(.*)Date and Time (.*)\nCurrent Uptime (.*)\n/s";
// 1 = Model #
// 2 = Factory IP
// 3 = Filler
// 4 = Firmware
// 5 = Filler
// 6 = Name
// 7 = Filler
// 8 = Date
// 9 = Uptime
The problem I'm having is that the groups that are followed by a delimiter and then more pattern (2, 4, 6, and 9) are too greedy and run past the boundary I want: 2, 6, and 9 should stop at the newline \n, and 4 should stop at the space. Groups 1 and 8 work fine.
I'm sure there's something wrong with my pattern, but I'm not versed enough in regexps to spot the issue. Can anyone help?
Thanks in advance.
Posts
A question mark after a quantifier, as in (.*?), means "non-greedy", which I think is exactly what you need.
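As a rough sketch of how that could look in your PHP pattern, assuming you are matching with preg_match: the sample values below are made up to stand in for the placeholders, the filler groups are dropped, and the firmware version is assumed to contain no spaces.

```php
<?php
// Made-up sample values; the real device output will differ.
$data = implode("\n", [
    '* Release',
    'Model EXAMPLE-100',
    'Factory IP Address 10.1.2.3',
    'Hardware T1r1.1.UU, 520 MHz, 128 MByte RAM',
    'Software V1.2.3.4 (2007-12-20)',
    '* Networking',
    'Camera Name lobbycam',
    'IP Address 10.10.10.3',
    '* System',
    'Date and Time 2008-01-15 12:00:00',
    'Current Uptime 3 days',
    '* Audio',
]) . "\n";

// Every quantifier is lazy (.*?), so each group stops at the first
// delimiter that lets the rest of the pattern match.
$pattern = '/Model (.*?)\nFactory IP Address (.*?)\n.*?Software (.*?) '
         . '.*?Camera Name (.*?)\n.*?Date and Time (.*?)\nCurrent Uptime (.*?)\n/s';

if (preg_match($pattern, $data, $m)) {
    // $m[1] = model, $m[2] = factory IP, $m[3] = firmware,
    // $m[4] = name, $m[5] = date, $m[6] = uptime
    echo implode(' | ', array_slice($m, 1)), "\n";
}
```

With the sample above, group 2 comes back as just 10.1.2.3 instead of running on to a later newline. If the device sends \r\n line endings, the captured values would need a trim().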
$lines[1] =~ /<(.*)>/o;
$model = $1;
(...)
Since the running time for a simple Perl regex like this is roughly O(<regex_length> * <text_length>) (backtracking can make it much worse for some patterns), cutting down on both of those is very helpful.
Using the /o flag tells Perl to build the finite-state automaton behind the match only once. That only makes a difference when the pattern interpolates variables (a constant pattern like this one is compiled once anyway), but it does no harm here.
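The same one-short-pattern-per-field idea can be carried over to PHP. This is only a sketch: field_after_label is a hypothetical helper, the sample values are made up, and it assumes the real values are not literally wrapped in angle brackets but simply follow their label on the same line.

```php
<?php
// Hypothetical helper: returns the text that follows "$label " on a line
// that starts with that label, or null if no such line exists.
function field_after_label(string $data, string $label): ?string
{
    // The m flag makes ^ and $ match at line boundaries. Without the s flag,
    // "." does not match a newline, so (.*) cannot run past the end of the line.
    $pattern = '/^' . preg_quote($label, '/') . ' (.*)$/m';
    return preg_match($pattern, $data, $m) ? rtrim($m[1], "\r") : null;
}

// Made-up sample values; the real device output will differ.
$data = "Model EXAMPLE-100\n"
      . "Factory IP Address 10.1.2.3\n"
      . "Camera Name lobbycam\n"
      . "IP Address 10.10.10.3\n";

echo field_after_label($data, 'Model'), "\n";
echo field_after_label($data, 'IP Address'), "\n";
```

Anchoring on ^ keeps the "IP Address" lookup from matching inside the "Factory IP Address" line, and each pattern stays short enough to check by eye.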