Sessionize logs using the time interval between visits.
Sessionization is a common analytic operation in log analysis. Given an input pageview table, where each row records a webpage visit made by a particular user (or IP address), the sessionization operation identifies each user's web browsing sessions from the recorded visits by grouping that user's visits based on the time intervals between them.
Conceptually, if two consecutive visits from the same user are made too far apart in time (as defined by a time-out threshold), they are treated as belonging to two separate browsing sessions.
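The grouping rule described above can be sketched in a few lines of Python. This is only an illustration of the concept, not how TD_SESSIONIZE() is implemented: the function name `sessionize`, the `(user, unix_time)` input shape, the integer session numbers, and the choice to start a new session only when the gap is strictly greater than the threshold are all assumptions made for this example.

```python
from collections import defaultdict


def sessionize(visits, timeout_seconds=3600):
    """Label each (user, unix_time) visit with a per-user session number.

    Hypothetical helper for illustration only. A new session starts when
    the gap to the user's previous visit exceeds timeout_seconds
    (strictly greater than; this boundary choice is an assumption).
    """
    # Group visit timestamps by user.
    by_user = defaultdict(list)
    for user, ts in visits:
        by_user[user].append(ts)

    result = []
    for user, times in by_user.items():
        times.sort()  # visits must be in time order within each user
        session = 0
        previous = None
        for ts in times:
            if previous is not None and ts - previous > timeout_seconds:
                session += 1  # gap too large: start a new session
            previous = ts
            result.append((user, ts, session))
    return result


# Sample data invented for this sketch: timestamps are seconds.
visits = [("alice", 0), ("alice", 1800), ("alice", 7200), ("bob", 100)]
sessions = sessionize(visits)
for row in sessions:
    print(row)
```

With a 1-hour threshold, the first two visits from "alice" are 30 minutes apart and share a session, while the third comes 90 minutes after the second and starts a new one. Sorting within each user matters: the rule compares each visit only with the same user's immediately preceding visit.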
Treasure Data's Hive engine provides the TD_SESSIONIZE() function, which makes sessionization straightforward. The example below uses a 1-hour time-out to split the sessions.