
Improve your time management with Jupyter

Python has extremely scalable options for exploring data. With Pandas or Dask, you can scale Jupyter up to big data. But what about small data? Personal data? Private data?

JupyterLab and Jupyter Notebook provide a great environment for scrutinizing my laptop-based life.

My exploration is powered by the fact that almost every service I use has a web application programming interface (API). I use many such services: a to-do list, a time tracker, a habit tracker, and more. But there is one that almost everyone uses: a calendar. The same ideas can be applied to other services, but calendars have one cool feature: CalDAV, an open standard that almost all web calendars support.

Parsing your calendar with Python in Jupyter

Most calendar services provide a way to access your events over CalDAV. You may need some authentication for accessing this private data. Following your service's instructions should do the trick. How you get the credentials depends on your service, but eventually, you should be able to store them in a file. I store mine in my home directory in a file called .caldav:

import os
with open(os.path.expanduser("~/.caldav")) as fpin:
    username, password = fpin.read().split()

Never put usernames and passwords directly in notebooks! They could easily leak with a stray git push.
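As a minimal sketch of a slightly more defensive version of the snippet above, the helper below checks that the file holds exactly two whitespace-separated fields before unpacking them. The function name parse_credentials and the sample values are hypothetical, not part of any library.

```python
def parse_credentials(text):
    """Split 'username password' text into a (username, password) pair."""
    fields = text.split()
    if len(fields) != 2:
        # Fail loudly without echoing the file contents, which may be secret.
        raise ValueError(
            f"expected 2 fields in credentials file, found {len(fields)}"
        )
    username, password = fields
    return username, password


# Made-up values; in the notebook the text would come from ~/.caldav.
example_username, example_password = parse_credentials("alice correct-horse\n")
```

A password that contains spaces would not survive this format, so a different separator would be needed in that case.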

The next step is to use the convenient caldav library from PyPI. I looked up the CalDAV server for my email service (yours may be different):

import caldav
client = caldav.DAVClient(url="https://caldav.fastmail.com/dav/", username=username, password=password)

CalDAV has a concept called the principal. It is not important to get into right now, except to know that it is the thing you use to access the calendars:

principal = client.principal()
calendars = principal.calendars()

Calendars are, literally, all about time. Before accessing events, you need to decide on a time range. One week should be a good default:

from dateutil import tz
import datetime
now = datetime.datetime.now(tz.tzutc())
since = now - datetime.timedelta(days=7)
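If you want to experiment with other ranges, a small helper keeps the arithmetic in one place. This is only a sketch: it uses the standard library's datetime.timezone.utc in place of tz.tzutc(), and the name time_range is hypothetical.

```python
import datetime


def time_range(days, now=None):
    """Return (since, now) as timezone-aware UTC datetimes spanning `days` days."""
    if now is None:
        now = datetime.datetime.now(datetime.timezone.utc)
    since = now - datetime.timedelta(days=days)
    return since, now


# A two-week window ending at a fixed, made-up moment.
fixed_now = datetime.datetime(2020, 8, 26, 12, 0, tzinfo=datetime.timezone.utc)
since_example, now_example = time_range(14, now=fixed_now)
print(since_example.isoformat())
```

Passing now explicitly, as in the example, makes the range reproducible when you rerun a notebook later.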

Most people use more than one calendar, and most people want all their events together. The itertools.chain.from_iterable function makes this straightforward:

import itertools

raw_events = list(
    itertools.chain.from_iterable(
        calendar.date_search(start=since, end=now, expand=True)
        for calendar in calendars
    )
)
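To see what itertools.chain.from_iterable does without a live server, here is a minimal sketch in which made-up lists of strings stand in for each calendar's search results.

```python
import itertools

# Stand-ins for the per-calendar search results (made-up event names).
results_per_calendar = [
    ["work: standup", "work: review"],
    [],  # a calendar with no events in the range
    ["home: dentist"],
]

# Flatten the list of lists into one list, preserving order.
flattened = list(itertools.chain.from_iterable(results_per_calendar))
print(flattened)
```

Empty calendars simply contribute nothing, so no special handling is needed for them.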

Reading all the events into memory is important, and doing so in the API's raw, native format is an important practice. It means that when fine-tuning the parsing, analyzing, and displaying code, there is no need to go back to the API service to refresh the data.

But “raw” is not an understatement. The events come through as strings in a specific format:

print(raw_events[12].data)

    BEGIN:VCALENDAR
    VERSION:2.0
    PRODID:-//CyrusIMAP.org/Cyrus
     3.3.0-232-g4bdb081-fm-20200825.002-g4bdb081a//EN
    BEGIN:VEVENT
    DTEND:20200825T230000Z
    DTSTAMP:20200825T181915Z
    DTSTART:20200825T220000Z
    SUMMARY:Busy
    UID:
     1302728i-040000008200E00074C5B7101A82E00800000000D939773EA578D601000000000
     000000010000000CD71CC3393651B419E9458134FE840F5
    END:VEVENT
    END:VCALENDAR
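Notice that the PRODID and UID values continue on lines that begin with a space. This is the iCalendar line-folding convention (RFC 5545): a line break followed by a single space or tab continues the previous line. Parsing libraries handle this for you; the sketch below only illustrates the rule on a made-up fragment, and the function name unfold is hypothetical.

```python
def unfold(text):
    """Join folded iCalendar lines: a line starting with space or tab continues the previous one."""
    unfolded = []
    for line in text.splitlines():
        if line[:1] in (" ", "\t") and unfolded:
            # Drop the single leading whitespace character and append the rest.
            unfolded[-1] += line[1:]
        else:
            unfolded.append(line)
    return unfolded


sample = "BEGIN:VEVENT\nUID:\n abc123\n def456\nSUMMARY:Busy\nEND:VEVENT\n"
print(unfold(sample))
```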

Luckily, PyPI comes to the rescue again with another helper library, vobject:

import io
import vobject

def parse_event(raw_event):
    # Parse the raw iCalendar text and return the VEVENT's contents dictionary.
    data = raw_event.data
    parsed = vobject.readOne(io.StringIO(data))
    contents = parsed.vevent.contents
    return contents

parse_event(raw_events[12])

    {'dtend': [<DTEND2020-08-25 23:00:00+00:00>],
     'dtstamp': [<DTSTAMP2020-08-25 18:19:15+00:00>],
     'dtstart': [<DTSTART2020-08-25 22:00:00+00:00>],
     'summary': [<SUMMARYBusy>],
     'uid': [<UID1302728i-040000008200E00074C5B7101A82E00800000000D939773EA578D601000000000000000010000000CD71CC3393651B419E9458134FE840F5>]}

Well, at least it's a little better.

There is still some work to do to convert it to a reasonable Python object. The first step is to have a reasonable Python object. The attrs library provides a nice start:

# The __future__ import must come before any other statement in the cell.
from __future__ import annotations
from typing import Any
import attr

@attr.s(auto_attribs=True, frozen=True)
class Event:
    start: datetime.datetime
    end: datetime.datetime
    timezone: Any
    summary: str
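If you would rather not install attrs, the standard library's dataclasses module can express a similar frozen class. This is a swapped-in alternative, not what the rest of this article uses; the class name PlainEvent and the sample values are hypothetical.

```python
import dataclasses
import datetime
from typing import Any


@dataclasses.dataclass(frozen=True)
class PlainEvent:
    start: datetime.datetime
    end: datetime.datetime
    timezone: Any
    summary: str


example = PlainEvent(
    start=datetime.datetime(2020, 8, 25, 22, 0, tzinfo=datetime.timezone.utc),
    end=datetime.datetime(2020, 8, 25, 23, 0, tzinfo=datetime.timezone.utc),
    timezone=datetime.timezone.utc,
    summary="Busy",
)

# Frozen instances reject attribute assignment.
try:
    example.summary = "Changed"
except dataclasses.FrozenInstanceError:
    print("events are immutable")
```

Freezing applies to instances only, so properties can still be attached to the class afterward.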

Time to write the conversion code!

The first abstraction gets the value from the parsed dictionary without all the decorations:

def get_piece(contents, name):
    return contents[name][0].value

get_piece(_, "dtstart")
    datetime.datetime(2020, 8, 25, 22, 0, tzinfo=tzutc())

Calendar events always have a start, but they sometimes have an “end” and sometimes a “duration.” Some careful parsing logic can harmonize both into the same Python objects:

def from_calendar_event_and_timezone(event, timezone):
    contents = parse_event(event)
    start = get_piece(contents, "dtstart")
    summary = get_piece(contents, "summary")
    try:
        end = get_piece(contents, "dtend")
    except KeyError:
        # No explicit end time, so compute it from the duration.
        end = start + get_piece(contents, "duration")
    return Event(start=start, end=end, summary=summary, timezone=timezone)
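The end-or-duration fallback can be tried out without a calendar server. In this minimal sketch, plain dictionaries with made-up values stand in for the parsed contents, and the function name resolve_end is hypothetical.

```python
import datetime


def resolve_end(fields):
    """Return the end time from a dict holding 'dtstart' plus 'dtend' or 'duration'."""
    start = fields["dtstart"]
    try:
        return fields["dtend"]
    except KeyError:
        # No explicit end: derive it from the duration instead.
        return start + fields["duration"]


start = datetime.datetime(2020, 8, 25, 22, 0, tzinfo=datetime.timezone.utc)
with_end = {"dtstart": start, "dtend": start + datetime.timedelta(hours=1)}
with_duration = {"dtstart": start, "duration": datetime.timedelta(hours=1)}

# Both representations resolve to the same end time.
print(resolve_end(with_end) == resolve_end(with_duration))
```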

Since it is useful to have the events in your local time zone rather than UTC, this uses the local timezone:

my_timezone = tz.gettz()
from_calendar_event_and_timezone(raw_events[12], my_timezone)
    Event(start=datetime.datetime(2020, 8, 25, 22, 0, tzinfo=tzutc()), end=datetime.datetime(2020, 8, 25, 23, 0, tzinfo=tzutc()), timezone=tzfile('/etc/localtime'), summary='Busy')

Now that the events are real Python objects, they really should have some additional information. Luckily, it is possible to add methods retroactively to classes.

But figuring out which day an event happens is not that obvious. You need the day in the local timezone:

def day(self):
    offset = self.timezone.utcoffset(self.start)
    fixed = self.start + offset
    return fixed.date()
Event.day = property(day)

print(_.day)
    2020-08-25
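To see why the local timezone matters, consider an event late in the evening. The sketch below uses only the standard library, with a made-up fixed UTC-7 offset standing in for tz.gettz(), and it uses astimezone as an alternative to adding the offset by hand.

```python
import datetime

# Hypothetical fixed offset standing in for a local timezone (UTC-7).
local_zone = datetime.timezone(datetime.timedelta(hours=-7))

# 02:00 UTC on August 26 is still the evening of August 25 at UTC-7.
start_utc = datetime.datetime(2020, 8, 26, 2, 0, tzinfo=datetime.timezone.utc)

utc_day = start_utc.date()
local_day = start_utc.astimezone(local_zone).date()
print(utc_day, local_day)
```

Grouping by the UTC date would file this event under the wrong day, which is the problem the day property avoids.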

Events are always represented internally as start/end, but knowing the duration is a useful property. Duration can also be added to the existing class:

def duration(self):
    return self.end - self.start
Event.duration = property(duration)

print(_.period)
    1:00:00

Now it is time to convert all events into useful Python objects:

all_events = [from_calendar_event_and_timezone(raw_event, my_timezone)
              for raw_event in raw_events]

All-day events are a special case and probably less useful for analyzing life. For now, you can ignore them:

# ignore all-day events, whose start is a plain date rather than a datetime
all_events = [event for event in all_events if type(event.start) is not datetime.date]
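The filter above compares exact types rather than using isinstance, and the reason is easy to miss: datetime.datetime is a subclass of datetime.date. A minimal sketch with made-up values shows the difference.

```python
import datetime

all_day_start = datetime.date(2020, 8, 25)
timed_start = datetime.datetime(2020, 8, 25, 22, 0)

# datetime.datetime is a subclass of datetime.date, so isinstance matches both.
print(isinstance(timed_start, datetime.date))

# An exact type comparison singles out the plain dates used by all-day events.
print(type(all_day_start) is datetime.date, type(timed_start) is datetime.date)
```

An isinstance check would therefore discard every event, not only the all-day ones.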

Events have a natural order; knowing which one happened first is probably useful for analysis:

all_events.sort(key=lambda ev: ev.start)

Now that the events are sorted, they can be broken into days:

import collections
events_by_day = collections.defaultdict(list)
for event in all_events:
    events_by_day[event.day].append(event)
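Once events are grouped this way, simple per-day analysis becomes a one-liner. As a sketch of one possible analysis, the code below totals the time booked per day; the (day, duration) pairs are made up and stand in for real Event objects.

```python
import collections
import datetime

# Made-up (day, duration) pairs standing in for Event objects, already sorted.
samples = [
    (datetime.date(2020, 8, 24), datetime.timedelta(minutes=30)),
    (datetime.date(2020, 8, 25), datetime.timedelta(minutes=45)),
    (datetime.date(2020, 8, 25), datetime.timedelta(hours=1)),
]

durations_by_day = collections.defaultdict(list)
for day, duration in samples:
    durations_by_day[day].append(duration)

# sum() needs a timedelta start value, since the default start is the integer 0.
total_by_day = {
    day: sum(durations, datetime.timedelta())
    for day, durations in durations_by_day.items()
}
print(total_by_day[datetime.date(2020, 8, 25)])
```

Because the input is sorted and dictionaries preserve insertion order, the days come out in chronological order.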

And with that, you have calendar events with dates, duration, and sequence as Python objects.

Reporting on your life in Python

Now it is time to write reporting code! It is fun to have eye-popping formatting with proper headers, lists, important things in bold, etc.

This means HTML and some HTML templating. I like to use Chameleon:

template_content = """
<html><body>
<div tal:repeat="item items">
<h2 tal:content="item[0]">Day</h2>
<ul>
    <li tal:repeat="event item[1]"><span tal:replace="event">Thing</span></li>
</ul>
</div>
</body></html>"""

One cool feature of Chameleon is that it will render objects using their __html__ method. I will use it in two ways:

  • The summary will be in bold
  • For most events, I will remove the summary (since this is my personal information)

def __html__(self):
    offset = my_timezone.utcoffset(self.start)
    fixed = self.start + offset
    # Drop the UTC offset suffix from the printed timestamp.
    start_str = str(fixed).split("+")[0]
    summary = self.summary
    if summary != "Busy":
        summary = "&lt;REDACTED&gt;"
    return f"<b>{summary[:30]}</b> -- {start_str} ({self.duration})"
Event.__html__ = __html__
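Because the output of __html__ is inserted into the page as-is, any text that comes straight from the calendar should be escaped by hand. The method above only ever emits fixed strings, but a real summary containing < or & would need this. Here is a minimal sketch using the standard library's html.escape; the function name render_summary and the sample summary are hypothetical.

```python
import html


def render_summary(summary, limit=30):
    """Truncate a summary, escape it for HTML, and wrap it in bold tags."""
    # Truncate first so that an escape sequence such as &amp; is never cut in half.
    return f"<b>{html.escape(summary[:limit])}</b>"


print(render_summary("Lunch <with> R&D"))
```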

In the interest of brevity, the report will be sliced into one day's worth.

import chameleon
from IPython.display import HTML
template = chameleon.PageTemplate(template_content)
html = template(items=itertools.islice(events_by_day.items(), 3, 4))
HTML(html)

When rendered, it will look something like this:

2020-08-25

  • <REDACTED> -- 2020-08-25 08:30:00 (0:45:00)
  • <REDACTED> -- 2020-08-25 10:00:00 (1:00:00)
  • <REDACTED> -- 2020-08-25 11:30:00 (0:30:00)
  • <REDACTED> -- 2020-08-25 13:00:00 (0:25:00)
  • Busy -- 2020-08-25 15:00:00 (1:00:00)
  • <REDACTED> -- 2020-08-25 15:00:00 (1:00:00)
  • <REDACTED> -- 2020-08-25 19:00:00 (1:00:00)
  • <REDACTED> -- 2020-08-25 19:00:12 (1:00:00)
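The itertools.islice call with 3 and 4 is what limits the report to a single day: it skips the first three (day, events) pairs and yields only the fourth. A minimal sketch with a made-up dictionary standing in for events_by_day:

```python
import itertools

# Made-up stand-in for events_by_day; dicts preserve insertion order.
by_day = {
    "2020-08-22": ["a"],
    "2020-08-23": ["b"],
    "2020-08-24": ["c"],
    "2020-08-25": ["d", "e"],
    "2020-08-26": ["f"],
}

# islice(iterable, 3, 4) skips three items and yields only the fourth.
one_day = list(itertools.islice(by_day.items(), 3, 4))
print(one_day)
```

Changing the two numbers, or dropping islice entirely, widens the report to more days.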

Endless choices with Python and Jupyter

This only scratches the surface of what you can do by parsing, analyzing, and reporting on the data that various web services have on you.

Why not try it with your favorite service?
