• Getting average motorcycle price across all Craigslist cities

    Today I’m going to look at a motorcycle that’s for sale on Craigslist. The asking price for the bike seems fair, but I wanted to get a sense for what other people were asking for the same model and year.

    First I did a local search for the motorcycle I was interested in using the year, make and model search filters. The resultant URL was

    https://philadelphia.craigslist.org/search/mcy?srchType=T&auto_make_model=suzuki+TU250X&min_auto_year=2012&max_auto_year=2012
    

    This returned all the listings in Philadelphia for a 2012 Suzuki TU250X. The srchType=T parameter filters to only include results that have a match in the listing title.

    Using pup, a command-line tool for parsing HTML, I extracted the asking price of the motorcycle in the search result listing.

    curl -s "https://philadelphia.craigslist.org/search/mcy?srchType=T&auto_make_model=suzuki+TU250X&min_auto_year=2012&max_auto_year=2012" | \
    pup 'ul.rows li.result-row p.result-info span.result-meta span.result-price text{}'
    

    There is a CL page that lists every Craigslist site in the US. I parsed that for each location’s specific URL.

    curl -s "https://geo.craigslist.org/iso/us" | \
    pup 'div.geo-site-list-container a attr{href}'
    

    I combined these

    curl -s "https://geo.craigslist.org/iso/us" | \
    pup 'div.geo-site-list-container a attr{href}' | \
    while read location;
     do curl -s "$location/search/mcy?srchType=T&auto_make_model=suzuki+TU250X&min_auto_year=2012&max_auto_year=2012" | \
     pup 'ul.rows li.result-row p.result-info span.result-meta span.result-price text{}';
    done
    

    which outputs the asking prices…

    $3800
    $2750
    $2800
    $2950
    $3800
    $3750
    $2800
    $2400
    $2750
    $2950
    $2750
    $3800
    $3750
    $2700
    $2400
    ...
    

    I was then able to see how the price of the motorcycle in which I was interested compared to similar bikes throughout the US.


  • Resume

    I just updated my resume – the latest version of it should be viewable here.

    For this iteration of resume-writing I decided to try out the open-source JSON Resume schema and tooling. So far it’s OK. My only complaints are:

    • No markdown export support, despite the docs mentioning it
    • PDF exports are pretty gross
    • Can’t export using a local theme, though there’s an open issue about this

    To get a PDF version of my resume I ended up just writing it in markdown and then using pandoc to generate a PDF from that. I’m wondering if just using markdown and pandoc would be a better option in the long term, but we’ll see. It’s nice having a resume under vcs though :)


  • How to set SQL modes for MySQL in a CircleCI container

    In circle.yml you can add comma-separated sql-modes to /etc/mysql/my.cnf and then restart the mysql service, e.g.:

    database:
      pre:
        - echo "[mysqld]" | sudo tee -a /etc/mysql/my.cnf
        - echo 'sql-mode="ALLOW_INVALID_DATES"' | sudo tee -a /etc/mysql/my.cnf
        - sudo service mysql restart
    

  • OS X Setup Guide

    Recently I got a new work laptop. Today I configured it and documented the process. If I make any changes to these instructions I’ll update them in this Gist.

    Update: I recently had to set up a new computer, so I took the opportunity to automate a lot of this stuff: https://github.com/bi1yeu/new-macos

    Here are the programs and tweaks I first made on a fresh install of OS X 10.11:


    Homebrew / Cask

    $ /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
    $ brew tap caskroom/cask
    $ brew tap caskroom/versions
    $ brew tap caskroom/fonts
    $ brew tap tldr-pages/tldr
    

    Rearrage PATHs

    This is so that Homebrew-installed tools load instead of built-in ones. git, emacs, etc.

    $ sudo vi /etc/paths
    

    Put /usr/local/bin above /usr/bin in the list. Save file and restart terminal.

    Install with Cask

    $ brew cask install alfred                # https://www.alfredapp.com/
    $ brew cask install flux                  # https://justgetflux.com/
    $ brew cask install font-source-code-pro  # https://github.com/adobe-fonts/source-code-pro
    $ brew cask install google-chrome         # https://www.google.com/chrome/browser/desktop/
    $ brew cask install iterm2-nightly        # http://iterm2.com/
    $ brew cask install java                  # https://java.com/en/download/
    $ brew cask install keepingyouawake       # https://github.com/newmarcel/KeepingYouAwake
    $ brew cask install lunchy                # https://github.com/eddiezane/lunchy
    $ brew cask install middleclick           # http://rouge41.com/labs/
    $ brew cask install osxfuse               # https://osxfuse.github.io/
    $ brew cask install spectacle             # https://www.spectacleapp.com/
    

    Install with Homebrew

    $ brew install aspell                     # http://aspell.net/
    $ brew install git                        # https://git-scm.com/
    $ brew install gnupg2                     # https://www.gnupg.org/
    $ brew install homebrew/fuse/sshfs        # https://github.com/libfuse/sshfs
    $ brew install httpie                     # https://github.com/jkbrzt/httpie
    $ brew install jq                         # https://stedolan.github.io/jq/
    $ brew install leiningen                  # http://leiningen.org/
    $ brew install markdown                   # http://daringfireball.net/projects/markdown/
    $ brew install pandoc                     # http://pandoc.org/
    $ brew install task                       # http://taskwarrior.org/
    $ brew install the_platinum_searcher      # https://github.com/monochromegane/the_platinum_searcher
    $ brew install tldr                       # http://tldr-pages.github.io/
    

    launchctl / lunchy

    build locate database:

    $ lunchy install /System/Library/LaunchDaemons/com.apple.locate.plist
    $ lunchy start com.apple.locate.plist
    

    Create plist to periodically run brew update (replace <username> below):

      <?xml version="1.0" encoding="UTF-8"?>
      <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
      <plist version="1.0">
      <dict>
          <key>Label</key>
          <string>com.<username>.brew-update</string>
          <key>ProgramArguments</key>
          <array>
              <string>/usr/local/bin/brew</string>
              <string>update</string>
          </array>
          <key>StartCalendarInterval</key>
          <dict>
            <key>Hour</key>
            <integer>2</integer>
            <key>Minute</key>
            <integer>0</integer>
          </dict>
          <key>RunAtLoad</key>
          <true />
      </dict>
      </plist>
    

    And start it:

    $ lunchy start ~/Library/LaunchAgents/com.<username>.brew-update.plist
    

    Git configuration

    $ git config --global user.name '<your-proper-name>'
    $ git config --global user.email <email-address>
    $ git config --global core.editor 'vim'
    

    Spacemacs

    http://spacemacs.org/

    $ brew install --with-cocoa emacs
    $ ln -s /usr/local/Cellar/emacs/24.5/Emacs.app /Applications
    $ git clone https://github.com/syl20bnr/spacemacs ~/.emacs.d
    $ git clone https://github.com/bi1yeu/dotfiles
    $ cd ~/dotfiles/ && ./link-dotfiles.sh
    $ emacs --daemon && emacsclient -c &
    

    Bash prompt

    $ brew install bash-git-prompt
    

    Add to .bash_profile:

    if [ -f "$(brew --prefix)/opt/bash-git-prompt/share/gitprompt.sh" ]; then
        source "$(brew --prefix)/opt/bash-git-prompt/share/gitprompt.sh"
    fi
    
    GIT_PROMPT_ONLY_IN_REPO=1
    

    Prefs

    • Map Caps Lock key to Control modifier in Keyboard prefs
    • Adjust key repeat in Keyboard prefs
    • Enable Remote Login in Sharing prefs
    • Prevent computer from sleeping automatically when the display is off in Energy Saver prefs
    • Put display to sleep in Mission Control > Hot Corners prefs

    Generate SSH key and add it to ssh-agent

    $ ssh-keygen -t rsa -b 4096 -C "<email-address>"
    $ chmod 700 ~/.ssh && chmod 600 ~/.ssh/*
    $ eval "$(ssh-agent -s)"
    $ ssh-add ~/.ssh/id_rsa
    

    Edit iTerm2 settings to bind alt+arrow keys to move forward/backward

    https://coderwall.com/p/h6yfda/use-and-to-jump-forwards-backwards-words-in-iterm-2-on-os-x


  • Auto-incremented values can appear out of order if inserted in transactions

    Last week I was surprised to learn that auto-incrementing field values in MySQL don't always necessarily appear in sorted order to clients. That is, it's possible for a count of rows in a table to increase while the max of an auto-inc col remains unchanged. Auto-incremented values are created when the record is inserted, not necessarily when the transaction during which the record is inserted is committed.

    One of our systems at work can replicate updated rows from a table given that a column in it contains strictly incrementing values denoting the order of updates. I.e., if our system observes a row where the replication column has value X, subsequent replication jobs will only query for rows with a replication column value >= X. This is useful for efficiently replicating tables with a modified_at column or an append-only table with an incrementing ID. Most of the time...

    We have a user who was inserting rows very rapidly into one such source database. They were using an AUTO_INCREMENTing column as a replication key for our system. In general there's no issue with doing this. We found, however, that in this case, rows were being inserted in batches as part of transactions happening parallel.

    I discovered that auto-incrementing values are created at insertion time, not at commit time. In the case of relatively long-running transactions that happen simultaneously, this can result in auto-inc field values being visible in an unpredictable order.

    It's fairly easy to illustrate this problem. Here I use two MySQL connections to show what can happen when there are multiple transactions inserting into a table simultaneously.

    First, create a table with an auto-inc field and insert a row.

    mysql> CREATE TABLE animals (
        ->      id MEDIUMINT NOT NULL AUTO_INCREMENT,
        ->      name CHAR(30) NOT NULL,
        ->      PRIMARY KEY (id)
        -> );
    Query OK, 0 rows affected (0.03 sec)
    
    mysql> insert into animals (name) values ('aardvark');
    Query OK, 1 row affected (0.02 sec)
    

    Observe that a simple SELECT query returns the row, as expected. The id field starts at 1.

    mysql> select * from animals;
    +----+----------+
    | id | name     |
    +----+----------+
    |  1 | aardvark |
    +----+----------+
    1 row in set (0.00 sec)
    

    Now, start a new transaction and insert some more records:

    mysql> start transaction ;
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> insert into animals (name) values ('bear'), ('cow'), ('dinosaur');
    Query OK, 3 rows affected (0.00 sec)
    Records: 3  Duplicates: 0  Warnings: 0
    

    In another terminal, connect to the database and select from the table. You'll see only the original row is visible. The three records inserted during the transaction have not yet been committed.

    mysql> select * from animals;
    +----+----------+
    | id | name     |
    +----+----------+
    |  1 | aardvark |
    +----+----------+
    1 row in set (0.00 sec)
    

    Now, still in the second terminal, start a separate transaction and insert three more records. This time, though, commit the transaction changes.

    If you query the database after that commit, you'll see that the three most recently committed rows are visible and are returned by the query. Also note, importantly, that auto-incrementing id field values have incremented past the values that were inserted but not yet committed.

    mysql> start transaction;
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> insert into animals (name) values ('elephant'), ('frog'), ('groundhog');
    Query OK, 3 rows affected (0.00 sec)
    Records: 3  Duplicates: 0  Warnings: 0
    
    mysql> commit;
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> select * from animals;
    +----+-----------+
    | id | name      |
    +----+-----------+
    |  1 | aardvark  |
    |  5 | elephant  |
    |  6 | frog      |
    |  7 | groundhog |
    +----+-----------+
    4 rows in set (0.00 sec)
    

    Back in the first terminal, you can commit the first transaction. Now all the records will be returned by a SELECT.

    mysql> commit;
    Query OK, 0 rows affected (0.01 sec)
    
    mysql> select * from animals;
    +----+-----------+
    | id | name      |
    +----+-----------+
    |  1 | aardvark  |
    |  2 | bear      |
    |  3 | cow       |
    |  4 | dinosaur  |
    |  5 | elephant  |
    |  6 | frog      |
    |  7 | groundhog |
    +----+-----------+
    7 rows in set (0.00 sec)
    

    I think the moral of the story is: don't rely on the order in which values in an AUTO_INCREMENT column appear. There could be rows in transactional flight with smaller values than the max of the auto-inc column.