Extracting and archiving my #music-we-like Slack chats

At work, we have a #music-we-like Slack channel, and it’s full of gems. I thought I’d see how hard it would be to extract my messages, and then archive them to a page on this blog.

Pretty straightforward as it turns out, thanks to Seb Seager’s slack-exporter.

I followed the setup instructions, got an OAuth token, and could slurp down the messages from the channel.

Because I only want to archive and post my own messages, I was hoping there’d be a user filter option alongside the --ch channel option. There isn’t, but no worries, as there’s also the --json option. I made a quick PR so the JSON it outputs is valid (objects were printed as single-quoted Python data structures, which is not valid JSON), and now I can pipe to jq to extract just my messages:

python3 ~/src/slack-exporter/exporter.py --json -c --ch CHANNELID | jq '[.[] | select(.user=="USERID") | {id: .client_msg_id, timestamp: .ts, text: .text, attachments: .attachments}]' | sed -E 's,<(https?://[^>]*)>,<a href=\\"\1\\">\1</a>,g' > ~/src/blog/_data/music-i-like.json

The sed portion of the pipe converts the URLs in Slack messages, which look like <https://example.com>, into HTML links: <a href="https://example.com">https://example.com</a>. The quotes in the replacement are backslash-escaped so the output stays valid JSON.
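For anyone who would rather not chain jq and sed, here is a minimal Python sketch of the same two steps: keep one user's messages, then rewrite Slack's angle-bracket URLs. The sample messages, the user IDs, and the function names linkify and my_messages are made up for illustration. Unlike the sed step, which runs over the whole JSON text, this sketch only rewrites the text field.

```python
import json
import re

# Hypothetical sample, shaped like the fields the jq filter reads.
messages = [
    {"client_msg_id": "a1", "user": "U111", "ts": "1600000000.000100",
     "text": "Listen: <https://example.com/track>", "attachments": []},
    {"client_msg_id": "b2", "user": "U222", "ts": "1600000050.000200",
     "text": "Not mine"},
]

# Same pattern as the sed expression: <http(s)://anything-but-a-closing-bracket>
SLACK_URL = re.compile(r"<(https?://[^>]*)>")


def linkify(text):
    """Turn Slack's <https://...> markup into an HTML anchor."""
    return SLACK_URL.sub(r'<a href="\1">\1</a>', text)


def my_messages(messages, user_id):
    """Keep one user's messages and the same four fields the jq filter keeps."""
    return [
        {
            "id": m.get("client_msg_id"),
            "timestamp": m.get("ts"),
            "text": linkify(m.get("text", "")),
            "attachments": m.get("attachments"),
        }
        for m in messages
        if m.get("user") == user_id
    ]


archive = my_messages(messages, "U111")
# json.dumps writes double-quoted, valid JSON and escapes the quotes in the anchor.
print(json.dumps(archive, indent=2))
```

Because the quoting is handled by json.dumps, there is no need for the manual backslash escaping that the sed replacement requires.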

Then we need a markdown page to render the _data/music-i-like.json file:

---
layout: page
title: Music I've liked
permalink: /music-i-like/
---

{% for message in site.data.music-i-like %}
  <article>
      {{ message.text }}

      {% for attachment in message.attachments %}
          {{ attachment.audio_html }}
      {% endfor %}
  </article>
{% endfor %}

This works, but the page is huge. Jekyll’s built-in pagination only works on index.html pages. There’s jekyll-paginate-v2, but it doesn’t support paginating data pages.

Time for another PR. We can turn pagination on for a single page in the frontmatter like this:

pagination:
  enabled: true
  # data: music-i-like

That commented-out last line is how I was initially thinking about triggering the data pagination, specifying what site.data to use, but this is already implicit in your page (which is named the same as your data file), so it felt unnecessary. Adding this to a data page should be enough to trigger pagination:

pagination:
  enabled: true

You can use all the other features of the jekyll-paginate-v2 gem, e.g. per_page:, trail:, limit:, sort_reverse:, and so on. Sorting is limited to date, as the data entries are coerced into fake posts with no other attributes.

Your markdown page needs some changes so the paginated content is displayed correctly:

<!-- Slice the data array to match the pagination. data_start sets the offset
     of the data array and this_page_data is the data for the page -->
{% assign data_start = page.pagination_info.curr_page | minus: 1 | times: paginator.per_page %}

{% assign this_page_data = site.data.music-i-like | slice: data_start, paginator.per_page %}

<!-- loop over the data that's just for the paginated page -->
{% for message in this_page_data %}
  <article id="{{ message.id }}">
      <time datetime="{{ message.timestamp | floor | date: "%Y-%m-%d" }}">{{ message.timestamp | floor | date: "%b %d %Y" }}</time>
      {{ message.text }}

      {% for attachment in message.attachments %}
          {{ attachment.audio_html }}
      {% endfor %}
  </article>
  <hr>

{% endfor %}
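The offset arithmetic in the template is easy to get wrong by one, so here is a small Python sketch of the same calculation, plus the floor-then-format step applied to Slack's timestamp strings. The function names and sample data are hypothetical, and formatting in UTC is an assumption (Jekyll uses whatever timezone the site is configured with).

```python
from datetime import datetime, timezone
import math


def page_slice(items, curr_page, per_page):
    """Return the items for a 1-based page number, like the Liquid slice."""
    data_start = (curr_page - 1) * per_page
    return items[data_start:data_start + per_page]


def format_ts(slack_ts):
    """Floor a Slack 'seconds.microseconds' string and format it as '%b %d %Y'."""
    seconds = math.floor(float(slack_ts))
    # Assumption: UTC. A Jekyll site may render in a different timezone.
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%b %d %Y")


items = list(range(1, 8))  # seven hypothetical data entries
print(page_slice(items, 1, 3))  # first page: offset 0
print(page_slice(items, 3, 3))  # last page: offset 6, only one entry left
print(format_ts("1600000000.000100"))
```

Page 1 starts at offset 0, which is why the template subtracts 1 from curr_page before multiplying by per_page.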