Diff

The wrgl diff command shows the changes between any two commits. It can either display the changes in the built-in difftool or write them to a CSV file.

The naming conventions are as follows: the first argument to wrgl diff is called the "newer" commit, whereas the second argument (whether specified or not) is called the "older" commit. Therefore, things that exist in the first commit but not the second are called "additions". Likewise, things that exist in the second but not the first are called "removals". However, these names do not imply a chronological relationship between the two commits.
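The convention can be illustrated with a small Python sketch. It is not part of wrgl: `classify_keys` and the sample keys are hypothetical names chosen only for illustration, and the sketch looks at primary key values alone.

```python
def classify_keys(first_keys, second_keys):
    """Classify primary keys using the naming convention described above.

    first_keys  -- keys found in the first ("newer") argument
    second_keys -- keys found in the second ("older") argument
    """
    first, second = set(first_keys), set(second_keys)
    return {
        # present in the first commit only
        "additions": sorted(first - second),
        # present in the second commit only
        "removals": sorted(second - first),
        # present in both commits (may or may not be modified)
        "shared": sorted(first & second),
    }


# Illustrative keys, not taken from a real repository.
print(classify_keys(["1", "2", "4"], ["1", "2", "3"]))
print(classify_keys(["1", "2", "3"], ["1", "2", "4"]))
```

Swapping the two arguments swaps the additions and the removals, which is why the terms depend only on argument order and not on which commit was created first.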

wrgl diff

Show changes between two commits

wrgl diff { COMMIT | COMMIT_OR_FILE COMMIT_OR_FILE | BRANCH --branch-file | --all | --txid TRANSACTION_ID } [flags]

A commit can be specified using a shortened sum, a full sum, or a reference name. If only one commit is specified, it is compared with its parent commit. It is also possible to compare against a local CSV file instead of a commit, in which case both arguments must be given and --primary-key should also be specified.

By default, this command shows the diff in a command-line UI, but it can write the changes to a CSV file instead when --no-gui is set.

Flags

--all

show diff summary for all branches that have branch.file configured. This flag is automatically set when no argument is given and --txid is not set

--branch-file

if only one argument is given and it is a branch name, compare against branch.file (if it is configured with wrgl commit --set-file)

--delimiter-1

CSV delimiter of the first argument if the first argument is an external file. Defaults to comma.

--delimiter-2

CSV delimiter of the second argument if the second argument is an external file. Defaults to comma.

-h, --help

help for diff

--mem-limit

limit memory consumption (in bytes). If not set then memory limit is automatically calculated.

--no-gui

don't show the diff table, instead output changes to file DIFF_SUM1_SUM2.csv

-n, --num-workers

number of CPU threads to utilize

-p, --primary-key

field names to be used as primary key (only applicable if diff target is a file)

--txid

show diff summary for all changes with specified transaction id

Inherited flags

--badger-log

set Badger log level, valid options are "error", "warning", "debug", and "info" (defaults to "error")

--cpuprofile

write cpu profile to file

--heapprofile

write heap profile to file

--log-file

output logs to specified file

--log-verbosity

log verbosity. A higher value produces more log output

--no-progress

don't display progress bar

--wrgl-dir

parent directory of repo, defaults to current working directory.

Examples

# show changes compared to the previous commit
wrgl diff 1a2ed62

# don't show the interactive table, output to a CSV file instead
wrgl diff 1a2ed62 --no-gui

# show changes between branches
wrgl diff branch-1 branch-2

# show changes between commits
wrgl diff 1a2ed6248c7243cdaaecb98ac12213a7 f1cf51efa2c1e22843b0e083efd89792

# show changes between files
wrgl diff file-1.csv file-2.csv --primary-key id,name

# show changes between a file and the head commit from a branch
wrgl diff my-file.csv my-branch

# show diff between branch.file config (set with wrgl commit --set-file) and the latest commit of a branch
wrgl diff my-branch --branch-file

# show diff summary for branches that have branch.file configured
wrgl diff --all

# show diff summary for all changes made with a transaction (run 'wrgl transaction -h' to learn more about transaction)
wrgl diff --txid a1dbfcc4-f6da-454c-a783-f1b70d347baf
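When these invocations are scripted, it can help to assemble the argument list programmatically. The following is a minimal Python sketch, assuming wrgl is installed and on the PATH. `build_diff_command` is a hypothetical helper, not something wrgl provides, and it uses only the --no-gui and --primary-key flags documented above.

```python
import shlex


def build_diff_command(first, second=None, no_gui=False, primary_key=None):
    """Assemble an argv list for `wrgl diff` (hypothetical helper).

    first, second -- commit sums, reference names, or CSV file paths
    no_gui        -- add --no-gui so changes are written to a CSV file
    primary_key   -- list of field names, joined with commas as in the
                     `--primary-key id,name` example above
    """
    argv = ["wrgl", "diff", first]
    if second is not None:
        argv.append(second)
    if no_gui:
        argv.append("--no-gui")
    if primary_key:
        argv += ["--primary-key", ",".join(primary_key)]
    return argv


cmd = build_diff_command("file-1.csv", "file-2.csv", primary_key=["id", "name"])
print(shlex.join(cmd))

# To actually run the command (requires wrgl and a repository):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when file names contain spaces.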

View changes with the difftool

Running wrgl diff without the --no-gui flag displays the built-in difftool.

Diff UI screenshot

UI elements

From top to bottom:

  • Commit names and sums in the format <first commit> vs <second commit>
  • One or more tabs, each written in the format (<keyboard shortcut>) <name>. Each can be activated with either mouse click or keyboard shortcut.
  • The diff table which can be interacted with using mouse and keyboard. It has the following color conventions:
    • Teal column: primary key column.
    • Green column: added column.
    • Red column: removed column.
    • Green/red cell: modified cell. The green part is the new value, whereas the red part is the old value.
  • Additional keyboard shortcuts.

Keyboard shortcuts

  • q: Activate the first tab.
  • w: Activate the second tab.
  • e: Activate the third tab.
  • arrow keys: Navigate inside the diff table.
  • g: Scroll to the beginning of the table.
  • Shift+g: Scroll to end of table.
  • h: Navigate left one cell.
  • j: Navigate down one cell.
  • k: Navigate up one cell.
  • l: Navigate right one cell.

Output changes to CSV

Running wrgl diff with the --no-gui flag creates a CSV file named DIFF_<sum1>_<sum2>.csv. The first column in this file is the row label, which aids with reviewing. The first four rows are special:

  1. Column names of the second commit. Columns that exist only in the first commit (added columns) are empty strings.
  2. Column names of the first commit. Columns that exist only in the second commit (removed columns) are empty strings.
  3. Primary key of the second commit. A cell has the value "true" if its column is part of the primary key, and is an empty string otherwise.
  4. Primary key of the first commit. A cell has the value "true" if its column is part of the primary key, and is an empty string otherwise.

The rest of the CSV shows the changes in three ways:

  • If the row was added (exists in the first commit only), the row from the first commit is output with the row label ADDED IN <first commit>.
  • If the row was removed (exists in the second commit only), the row from the second commit is output with the row label REMOVED IN <first commit>.
  • If the row was modified (exists in both commits), the row from the second commit is output with the row label BASE ROW FROM <second commit>, followed by the row from the first commit with the row label MODIFIED IN <first commit>.
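The three labelling rules can be sketched in Python as follows. This illustrates the rules only and is not wrgl's implementation. `label_rows` is a hypothetical function, the sample rows are illustrative, and the sketch assumes a single-column primary key, identical columns in both commits, and that unchanged rows are skipped.

```python
def label_rows(first_rows, second_rows, first_name, second_name, key_index=0):
    """Label changed rows following the three rules above.

    Rows are lists of strings; key_index is the position of the primary
    key column (a simplifying assumption of this sketch).
    """
    first = {row[key_index]: row for row in first_rows}
    second = {row[key_index]: row for row in second_rows}
    out = []
    for key, row in first.items():
        if key not in second:
            # exists in the first commit only
            out.append([f"ADDED IN {first_name}"] + row)
        elif row != second[key]:
            # exists in both commits with different values
            out.append([f"BASE ROW FROM {second_name}"] + second[key])
            out.append([f"MODIFIED IN {first_name}"] + row)
    for key, row in second.items():
        if key not in first:
            # exists in the second commit only
            out.append([f"REMOVED IN {first_name}"] + row)
    return out


# Illustrative rows: the first column is the primary key.
first_rows = [["1", "q", "e"], ["2", "a", "s"], ["4", "s", "d"]]
second_rows = [["1", "q", "w"], ["2", "a", "s"], ["3", "z", "x"]]
for line in label_rows(first_rows, second_rows, "<first commit>", "<second commit>"):
    print(",".join(line))
```

Note that removed rows are still labelled with the first commit's name, matching the REMOVED IN <first commit> rule. The order in which the sketch emits rows is its own choice and may differ from the file wrgl writes.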

Example

Given the CSVs below (note that main^ means the parent of the head commit of the main branch):

a,b,c
1,q,w
2,a,s
3,z,x

main^ (72718e7)

a,b,c
1,q,e
2,a,s
4,s,d

main (25a6dda)

Running this command:

wrgl diff main main^ --no-gui

Produces this file:

COLUMNS IN main^ (72718e7),a,b,c
COLUMNS IN main (25a6dda),a,b,c
PRIMARY KEY IN main^ (72718e7),true,,
PRIMARY KEY IN main (25a6dda),true,,
BASE ROW FROM main^ (72718e7),1,q,w
MODIFIED IN main (25a6dda),1,q,e
ADDED IN main (25a6dda),4,s,d
REMOVED IN main (25a6dda),3,z,x

DIFF_25a6dda_72718e7.csv
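A file in this layout can be read back with Python's standard csv module. The sketch below is one possible way to do so, not an official wrgl tool. `parse_diff_csv` is a hypothetical helper, and the sample text is the file content shown above, embedded as a string so that the sketch runs without a repository.

```python
import csv
import io

SAMPLE = """\
COLUMNS IN main^ (72718e7),a,b,c
COLUMNS IN main (25a6dda),a,b,c
PRIMARY KEY IN main^ (72718e7),true,,
PRIMARY KEY IN main (25a6dda),true,,
BASE ROW FROM main^ (72718e7),1,q,w
MODIFIED IN main (25a6dda),1,q,e
ADDED IN main (25a6dda),4,s,d
REMOVED IN main (25a6dda),3,z,x
"""


def parse_diff_csv(text):
    """Split a diff CSV into its four special rows and its change rows."""
    rows = [r for r in csv.reader(io.StringIO(text)) if r]
    # Drop the label column from each of the four special rows.
    old_cols, new_cols, old_pk, new_pk = (r[1:] for r in rows[:4])
    result = {
        "old_columns": old_cols,
        "new_columns": new_cols,
        "old_primary_key": [c for c, f in zip(old_cols, old_pk) if f == "true"],
        "new_primary_key": [c for c, f in zip(new_cols, new_pk) if f == "true"],
        "added": [],
        "removed": [],
        "modified": [],  # list of (base row, modified row) pairs
    }
    base = None
    for label, *cells in rows[4:]:
        if label.startswith("ADDED IN "):
            result["added"].append(cells)
        elif label.startswith("REMOVED IN "):
            result["removed"].append(cells)
        elif label.startswith("BASE ROW FROM "):
            base = cells  # remembered until the matching MODIFIED row
        elif label.startswith("MODIFIED IN "):
            result["modified"].append((base, cells))
            base = None
    return result


diff = parse_diff_csv(SAMPLE)
print(diff["modified"])
print(diff["added"], diff["removed"])

# To parse a real file instead of the embedded sample:
#   with open("DIFF_25a6dda_72718e7.csv", newline="") as f:
#       diff = parse_diff_csv(f.read())
```

Pairing each BASE ROW with the MODIFIED row that follows it relies on the ordering described earlier, where the base row is output immediately before its modified counterpart.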