-
Apr 18, 2017
Getting average motorcycle price across all Craigslist cities
Today I’m going to look at a motorcycle that’s for sale on Craigslist. The asking price for the bike seems fair, but I wanted to get a sense for what other people were asking for the same model and year.
First I did a local search for the motorcycle I was interested in using the year, make and model search filters. The resultant URL was
https://philadelphia.craigslist.org/search/mcy?srchType=T&auto_make_model=suzuki+TU250X&min_auto_year=2012&max_auto_year=2012
This returned all the listings in Philadelphia for a 2012 Suzuki TU250X. The srchType=T parameter filters to only include results that have a match in the listing title.
Using pup, a command-line tool for parsing HTML, I extracted the asking price of the motorcycle in the search result listing.
curl -s "https://philadelphia.craigslist.org/search/mcy?srchType=T&auto_make_model=suzuki+TU250X&min_auto_year=2012&max_auto_year=2012" | \ pup 'ul.rows li.result-row p.result-info span.result-meta span.result-price text{}'
There is a CL page that lists every Craigslist site in the US. I parsed that for each location’s specific URL.
curl -s "https://geo.craigslist.org/iso/us" | \ pup 'div.geo-site-list-container a attr{href}'
I combined these
curl -s "https://geo.craigslist.org/iso/us" | \ pup 'div.geo-site-list-container a attr{href}' | \ while read location; do curl -s "$location/search/mcy?srchType=T&auto_make_model=suzuki+TU250X&min_auto_year=2012&max_auto_year=2012" | \ pup 'ul.rows li.result-row p.result-info span.result-meta span.result-price text{}'; done
which outputs the asking prices…
$3800 $2750 $2800 $2950 $3800 $3750 $2800 $2400 $2750 $2950 $2750 $3800 $3750 $2700 $2400 ...
I was then able to see how the price of the motorcycle in which I was interested compared to similar bikes throughout the US.
-
Jan 2, 2017
Resume
I just updated my resume – the latest version of it should be viewable here.
For this iteration of resume-writing I decided to try out the open-source JSON Resume schema and tooling. So far it’s OK. My only complaints are:
- No markdown export support, despite the docs mentioning it
- PDF exports are pretty gross
- Can’t export using a local theme, though there’s an open issue about this
To get a PDF version of my resume I ended up just writing it in markdown and then using pandoc to generate a PDF from that. I’m wondering if just using markdown and pandoc would be a better option in the long term, but we’ll see. It’s nice having a resume under vcs though :)
-
Nov 8, 2016
How to set SQL modes for MySQL in a CircleCI container
In
circle.yml
you can add comma-separatedsql-mode
s to/etc/mysql/my.cnf
and then restart themysql
service, e.g.:database: pre: - echo "[mysqld]" | sudo tee -a /etc/mysql/my.cnf - echo 'sql-mode="ALLOW_INVALID_DATES"' | sudo tee -a /etc/mysql/my.cnf - sudo service mysql restart
-
Aug 29, 2016
OS X Setup Guide
Recently I got a new work laptop. Today I configured it and documented the process. If I make any changes to these instructions I’ll update them in this Gist.
Update: I recently had to set up a new computer, so I took the opportunity to automate a lot of this stuff: https://github.com/bi1yeu/new-macos
Here are the programs and tweaks I first made on a fresh install of OS X 10.11:
Homebrew / Cask
$ /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)" $ brew tap caskroom/cask $ brew tap caskroom/versions $ brew tap caskroom/fonts $ brew tap tldr-pages/tldr
Rearrage PATHs
This is so that Homebrew-installed tools load instead of built-in ones. git, emacs, etc.
$ sudo vi /etc/paths
Put
/usr/local/bin
above/usr/bin
in the list. Save file and restart terminal.Install with Cask
$ brew cask install alfred # https://www.alfredapp.com/ $ brew cask install flux # https://justgetflux.com/ $ brew cask install font-source-code-pro # https://github.com/adobe-fonts/source-code-pro $ brew cask install google-chrome # https://www.google.com/chrome/browser/desktop/ $ brew cask install iterm2-nightly # http://iterm2.com/ $ brew cask install java # https://java.com/en/download/ $ brew cask install keepingyouawake # https://github.com/newmarcel/KeepingYouAwake $ brew cask install lunchy # https://github.com/eddiezane/lunchy $ brew cask install middleclick # http://rouge41.com/labs/ $ brew cask install osxfuse # https://osxfuse.github.io/ $ brew cask install spectacle # https://www.spectacleapp.com/
Install with Homebrew
$ brew install aspell # http://aspell.net/ $ brew install git # https://git-scm.com/ $ brew install gnupg2 # https://www.gnupg.org/ $ brew install homebrew/fuse/sshfs # https://github.com/libfuse/sshfs $ brew install httpie # https://github.com/jkbrzt/httpie $ brew install jq # https://stedolan.github.io/jq/ $ brew install leiningen # http://leiningen.org/ $ brew install markdown # http://daringfireball.net/projects/markdown/ $ brew install pandoc # http://pandoc.org/ $ brew install task # http://taskwarrior.org/ $ brew install the_platinum_searcher # https://github.com/monochromegane/the_platinum_searcher $ brew install tldr # http://tldr-pages.github.io/
launchctl / lunchy
build
locate
database:$ lunchy install /System/Library/LaunchDaemons/com.apple.locate.plist $ lunchy start com.apple.locate.plist
Create plist to periodically run
brew update
(replace<username>
below):<?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd"> <plist version="1.0"> <dict> <key>Label</key> <string>com.<username>.brew-update</string> <key>ProgramArguments</key> <array> <string>/usr/local/bin/brew</string> <string>update</string> </array> <key>StartCalendarInterval</key> <dict> <key>Hour</key> <integer>2</integer> <key>Minute</key> <integer>0</integer> </dict> <key>RunAtLoad</key> <true /> </dict> </plist>
And start it:
$ lunchy start ~/Library/LaunchAgents/com.<username>.brew-update.plist
Git configuration
$ git config --global user.name '<your-proper-name>' $ git config --global user.email <email-address> $ git config --global core.editor 'vim'
Spacemacs
http://spacemacs.org/
$ brew install --with-cocoa emacs $ ln -s /usr/local/Cellar/emacs/24.5/Emacs.app /Applications $ git clone https://github.com/syl20bnr/spacemacs ~/.emacs.d $ git clone https://github.com/bi1yeu/dotfiles $ cd ~/dotfiles/ && ./link-dotfiles.sh $ emacs --daemon && emacsclient -c &
Bash prompt
$ brew install bash-git-prompt
Add to
.bash_profile
:if [ -f "$(brew --prefix)/opt/bash-git-prompt/share/gitprompt.sh" ]; then source "$(brew --prefix)/opt/bash-git-prompt/share/gitprompt.sh" fi GIT_PROMPT_ONLY_IN_REPO=1
Link Google Account to iCal
Prefs
- Map Caps Lock key to Control modifier in Keyboard prefs
- Adjust key repeat in Keyboard prefs
- Enable Remote Login in Sharing prefs
- Prevent computer from sleeping automatically when the display is off in Energy Saver prefs
- Put display to sleep in Mission Control > Hot Corners prefs
Generate SSH key and add it to
ssh-agent
$ ssh-keygen -t rsa -b 4096 -C "<email-address>" $ chmod 700 ~/.ssh && chmod 600 ~/.ssh/* $ eval "$(ssh-agent -s)" $ ssh-add ~/.ssh/id_rsa
Edit iTerm2 settings to bind alt+arrow keys to move forward/backward
https://coderwall.com/p/h6yfda/use-and-to-jump-forwards-backwards-words-in-iterm-2-on-os-x
-
Aug 23, 2016
Auto-incremented values can appear out of order if inserted in transactions
Last week I was surprised to learn that auto-incrementing field values in MySQL don't always necessarily appear in sorted order to clients. That is, it's possible for a count of rows in a table to increase while the max of an auto-inc col remains unchanged. Auto-incremented values are created when the record is inserted, not necessarily when the transaction during which the record is inserted is committed.
One of our systems at work can replicate updated rows from a table given that a column in it contains strictly incrementing values denoting the order of updates. I.e., if our system observes a row where the replication column has value
X
, subsequent replication jobs will only query for rows with a replication column value>= X
. This is useful for efficiently replicating tables with amodified_at
column or an append-only table with an incrementing ID. Most of the time...We have a user who was inserting rows very rapidly into one such source database. They were using an
AUTO_INCREMENT
ing column as a replication key for our system. In general there's no issue with doing this. We found, however, that in this case, rows were being inserted in batches as part of transactions happening parallel.I discovered that auto-incrementing values are created at insertion time, not at commit time. In the case of relatively long-running transactions that happen simultaneously, this can result in auto-inc field values being visible in an unpredictable order.
It's fairly easy to illustrate this problem. Here I use two MySQL connections to show what can happen when there are multiple transactions inserting into a table simultaneously.
First, create a table with an auto-inc field and insert a row.
mysql> CREATE TABLE animals ( -> id MEDIUMINT NOT NULL AUTO_INCREMENT, -> name CHAR(30) NOT NULL, -> PRIMARY KEY (id) -> ); Query OK, 0 rows affected (0.03 sec) mysql> insert into animals (name) values ('aardvark'); Query OK, 1 row affected (0.02 sec)
Observe that a simple
SELECT
query returns the row, as expected. Theid
field starts at 1.mysql> select * from animals; +----+----------+ | id | name | +----+----------+ | 1 | aardvark | +----+----------+ 1 row in set (0.00 sec)
Now, start a new transaction and insert some more records:
mysql> start transaction ; Query OK, 0 rows affected (0.00 sec) mysql> insert into animals (name) values ('bear'), ('cow'), ('dinosaur'); Query OK, 3 rows affected (0.00 sec) Records: 3 Duplicates: 0 Warnings: 0
In another terminal, connect to the database and select from the table. You'll see only the original row is visible. The three records inserted during the transaction have not yet been committed.
mysql> select * from animals; +----+----------+ | id | name | +----+----------+ | 1 | aardvark | +----+----------+ 1 row in set (0.00 sec)
Now, still in the second terminal, start a separate transaction and insert three more records. This time, though, commit the transaction changes.
If you query the database after that commit, you'll see that the three most recently committed rows are visible and are returned by the query. Also note, importantly, that auto-incrementing
id
field values have incremented past the values that were inserted but not yet committed.mysql> start transaction; Query OK, 0 rows affected (0.00 sec) mysql> insert into animals (name) values ('elephant'), ('frog'), ('groundhog'); Query OK, 3 rows affected (0.00 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql> commit; Query OK, 0 rows affected (0.00 sec) mysql> select * from animals; +----+-----------+ | id | name | +----+-----------+ | 1 | aardvark | | 5 | elephant | | 6 | frog | | 7 | groundhog | +----+-----------+ 4 rows in set (0.00 sec)
Back in the first terminal, you can commit the first transaction. Now all the records will be returned by a
SELECT
.mysql> commit; Query OK, 0 rows affected (0.01 sec) mysql> select * from animals; +----+-----------+ | id | name | +----+-----------+ | 1 | aardvark | | 2 | bear | | 3 | cow | | 4 | dinosaur | | 5 | elephant | | 6 | frog | | 7 | groundhog | +----+-----------+ 7 rows in set (0.00 sec)
I think the moral of the story is: don't rely on the order in which values in an
AUTO_INCREMENT
column appear. There could be rows in transactional flight with smaller values than the max of the auto-inc column.
Recent Posts
- Agentic git bisect
- MacWhisper
- Coding with agent helper scripts
- Vibecoding’s allure: the inventor and the fiend
- Philadelphia Data Explorer
- Plant identification
- Let’s Flip an Unfair Coin
- Flight Focus
- Generative AI Accelerates Our Exploration of the Search Space
- AI Assistant Use-Case: Performance Feedback