Keep Looking Up...
"That’s the secret to life" as Snoopy says. And in Splunk, lookup tables are the secret to data enrichment.
Lookup Basics
For those of you who may be new to Splunk, lookups are tables that allow you to enhance your data. A lookup adds fields to your events from a CSV file or from an external Python script. Lookup files are located in the lookups directory of the app ($SPLUNK_HOME/etc/apps/<appname>/lookups).
Let’s take web server logs as an example. Say you want events with an HTTP status code of 202 to get a new field called “http_status_description” with the value “Success”. The CSV file for this lookup table would look something like this:
http_status_code,http_status_description
202,Success
The following stanza would be added to your transforms.conf file:
[http_status_description_lookup]
filename = http_status_description.csv
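Finally, a lookup statement would be added to the props.conf file to run the lookup automatically. It goes under the stanza for the events you want to enrich (here, a sourcetype named http_status_description) and references the transforms.conf stanza name, followed by the field to match and the field to output:
[http_status_description]
LOOKUP-http = http_status_description_lookup http_status_code OUTPUT http_status_description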
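If it helps to picture what the lookup is doing, here is a minimal Python sketch of the same idea: read the CSV, match on http_status_code, and add http_status_description to the event. It is only an illustration of the concept, not how Splunk implements lookups, and the sample events and the function names load_lookup and enrich are made up for this example.

```python
import csv
import io

# Contents of the lookup file from the example above.
LOOKUP_CSV = """http_status_code,http_status_description
202,Success
"""

def load_lookup(text):
    """Map each http_status_code to its http_status_description."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["http_status_code"]: row["http_status_description"] for row in reader}

def enrich(event, table):
    """Return a copy of the event, adding the description when the code matches."""
    enriched = dict(event)
    description = table.get(str(event.get("http_status_code")))
    if description is not None:
        enriched["http_status_description"] = description
    return enriched

table = load_lookup(LOOKUP_CSV)

# Hypothetical events; real ones would come from your web server logs.
events = [{"http_status_code": "202"}, {"http_status_code": "404"}]
enriched_events = [enrich(event, table) for event in events]
```

The event with status 202 picks up the new field, while the 404 event is left unchanged because the table has no row for it.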
Temporal Lookups
But what if you need a time-based lookup? For example, you have a dashboard with a form that allows employees to submit data on a daily basis. Once you submit, the data will actually be saved into a lookup table, including a timestamp. Your lookup table could look something like:
ip,timestamp,data
95.177.23.172,10/3/2014 08:30:45,network_security
56.48.75.240,10/3/2014 09:14:32,operations
39.153.79.50,10/3/2014 10:02:53,network_security
231.238.251.138,10/3/2014 12:24:17,sales
If you want to see everything that was submitted over the past 24 hours, 90 days, etc., you can define the lookup as time-based so that its entries are matched by their timestamp as well as by field value.
To do this, add the following stanza to your transforms.conf:
[Employee_Input]
filename = employee_input_data.csv
time_field = timestamp
time_format = %d/%m/%Y %H:%M:%S
*time_format uses strptime syntax (%Y is the four-digit year used in the table above); if you omit time_format, the timestamps are expected to be in epoch time.
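A quick way to sanity-check a strptime format string against your data is to try it in Python, whose datetime.strptime uses the same directives for the fields involved here. This is only a sketch under that assumption: Splunk's parser is not Python's and may differ in edge cases. The sample rows use a four-digit year, which corresponds to %Y rather than %y.

```python
from datetime import datetime

# First timestamp from the sample lookup table.
sample = "10/3/2014 08:30:45"

# %Y is the four-digit year; with day-first ordering this reads as 10 March 2014.
parsed = datetime.strptime(sample, "%d/%m/%Y %H:%M:%S")

# %y expects a two-digit year, so it cannot consume "2014".
try:
    datetime.strptime(sample, "%d/%m/%y %H:%M:%S")
    two_digit_year_parses = True
except ValueError:
    two_digit_year_parses = False
```

If your timestamps are actually month-first, swap the first two directives (%m/%d/%Y) so the same string reads as 3 October 2014.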
Additionally, if you want to control how lenient the time matching is, you can set max_offset_secs. This setting is the maximum number of seconds that an event's timestamp can be later than the lookup entry's timestamp for the two to still match. max_offset_secs defaults to 2000000000 (two billion), which is effectively no limit. There is also a min_offset_secs, the minimum number of seconds by which the event must follow the lookup entry, which defaults to 0.
With the setting:
max_offset_secs = 120
an event matches a lookup entry only if it occurred no more than 2 minutes after that entry's timestamp.
With the setting:
max_offset_secs = 0
an event matches only if its timestamp is the same as the lookup entry's timestamp.
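To make the offset rule concrete, here is a small Python model of it: an event can match a lookup entry when the event's time is later than the entry's time by at least min_offset_secs and at most max_offset_secs. This is a simplified sketch based on how transforms.conf describes these settings, not Splunk's actual implementation; the function name within_offset, the sample times, and the choice to treat both bounds as inclusive are assumptions made for the example.

```python
from datetime import datetime

def within_offset(event_time, entry_time, min_offset_secs=0, max_offset_secs=2000000000):
    """Return True if the event follows the entry by an allowed number of seconds."""
    # Positive delta means the event happened after the lookup entry.
    delta = (event_time - entry_time).total_seconds()
    # Assumption: both bounds are treated as inclusive.
    return min_offset_secs <= delta <= max_offset_secs

# Hypothetical times for illustration.
entry = datetime(2014, 3, 10, 8, 30, 45)
one_minute_later = datetime(2014, 3, 10, 8, 31, 45)
five_minutes_later = datetime(2014, 3, 10, 8, 35, 45)

match_within_two_minutes = within_offset(one_minute_later, entry, max_offset_secs=120)
match_after_five_minutes = within_offset(five_minutes_later, entry, max_offset_secs=120)
match_exact_only = within_offset(entry, entry, max_offset_secs=0)
match_with_default = within_offset(five_minutes_later, entry)
```

With max_offset_secs = 120, the event one minute after the entry matches and the event five minutes after does not; with the default, both would match.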
For more information on lookup tables and how to set them up, visit http://docs.splunk.com/Documentation/Splunk/latest/Admin/Transformsconf and http://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Aboutlookupsandfieldactions
Happy Splunking!