Saving Tools

This page discusses several tools that can significantly improve the process of saving and loading files in a scientific context.

These tools are also used in the examples on the Real World Examples page. After reading the documentation here, it is worth having a look there as well!

We use `FileIO`

For saving and loading files we use FileIO.save and FileIO.load. FileIO only provides the interface and does not install any package that actually writes data, so you have to install whichever saving backend you want to use yourself!

In addition, DrWatson re-exports FileIO.save and FileIO.load for convenience!
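As a rough illustration of the point above, the following sketch saves and loads a dictionary through the re-exported save and load. It assumes that BSON has been installed as the backend (any other FileIO-supported backend should work similarly); the temporary file location and the contents of `results` are made up for the example.

```julia
using DrWatson   # re-exports FileIO.save and FileIO.load
using BSON       # assumed backend; it has to be installed separately, e.g. Pkg.add("BSON")

# Illustrative data only; BSON expects Symbol keys in the top-level dictionary.
results = Dict(:a => 0.5, :N => 100)

# A temporary location, so that the example does not touch a real project.
file = joinpath(mktempdir(), "results.bson")

save(file, results)   # FileIO picks the backend from the ".bson" extension
loaded = load(file)   # the stored dictionary is read back
```

If no backend for the chosen file extension is installed, the call to save is expected to fail rather than silently write nothing.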

We always call `mkpath`

All DrWatson functions that save things (e.g. tagsave, safesave) first call mkpath on the directory the file will be saved in.

Safely saving data

Almost all packages that save data overwrite existing files by default (if given the name of an existing file). This is the default behavior because it is often desired.

Sometimes it is not, though! The consequences of overwritten data can range from irrelevant to catastrophic. To avoid such an event we provide an alternative way to save data that will never overwrite existing files:

DrWatson.safesave (Function)
safesave(filename, data)

Safely save data in filename by ensuring that no existing files are overwritten. Do this by renaming already existing data with a backup-number ending like #1, #2, .... For example if filename = test.bson, the first time you safesave it, the file is saved normally. The second time the existing save is renamed to test_#1.bson and a new file test.bson is then saved.

If a backup file already exists then its backup-number is incremented (e.g. going from #2 to #3). For example safesaving test.bson a third time will rename the old test_#1.bson to test_#2.bson, rename the old test.bson to test_#1.bson and then save a new test.bson with the latest data.

See also tagsave.
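The renaming scheme described above can be sketched with plain Julia file operations. This is only an illustration of the backup numbering, not DrWatson's actual implementation: `sketch_safesave` is a hypothetical helper that writes a string with `write` instead of going through a saving backend.

```julia
# Hypothetical helper illustrating the backup-number scheme of safesave.
function sketch_safesave(filename::AbstractString, contents::AbstractString)
    mkpath(dirname(abspath(filename)))        # make sure the target directory exists
    base, ext = splitext(filename)
    backup(n) = string(base, "_#", n, ext)    # e.g. test_#1.txt, test_#2.txt, ...
    if isfile(filename)
        # Find the highest backup number currently in use.
        n = 0
        while isfile(backup(n + 1))
            n += 1
        end
        # Shift existing backups up by one, oldest first, so nothing is overwritten.
        for i in n:-1:1
            mv(backup(i), backup(i + 1))
        end
        # The current file becomes backup #1.
        mv(filename, backup(1))
    end
    write(filename, contents)                 # finally save the latest data
    return filename
end

dir = mktempdir()
file = joinpath(dir, "test.txt")
for contents in ("first", "second", "third")
    sketch_safesave(file, contents)
end
saved = sort(readdir(dir))
```

After the three calls, the directory holds test.txt with the latest contents, test_#1.txt with the second save and test_#2.txt with the first one.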

Tagging a run using Git

For reproducibility reasons (and also to not go insane when asking "HOW DID I GET THOSE RESUUUULTS") it is useful to "tag" any simulation/result/process with the Git commit of the repository.

To this end we have some functions that can be used to ensure reproducibility:

current_commit(gitpath = projectdir()) -> commit

Return the currently active commit ID of the Git repository present in gitpath, which by default is the project directory (projectdir()). If the repository is dirty when this function is called, the string will end with "_dirty".

Return nothing if gitpath is not a Git repository.

See also tag!.

Examples

julia> current_commit()
"96df587e45b29e7a46348a3d780db1f85f41de04"

julia> current_commit(path_to_dirty_repo)
"3bf684c6a115e3dce484b7f200b66d3ced8b0832_dirty"

DrWatson.tag! (Function)
tag!(d::Dict, gitpath = projectdir()) -> d

Tag d by adding an extra field commit whose value is the current_commit of the repository at gitpath (by default the project directory). Do nothing if a key commit already exists or if the Git repository is not found.

Notice that if String is not a subtype of the value type of d, then a new dictionary is created and returned. Otherwise the operation is in place (and the same dictionary is returned).

Examples

julia> d = Dict(:x => 3, :y => 4)
Dict{Symbol,Int64} with 2 entries:
  :y => 4
  :x => 3

julia> tag!(d)
Dict{Symbol,Any} with 3 entries:
  :y      => 4
  :commit => "96df587e45b29e7a46348a3d780db1f85f41de04"
  :x      => 3

DrWatson.@tag! (Macro)
@tag!(d, gitpath = projectdir()) -> d

Do the same as tag! but also add another field script that holds the path of the script that called @tag!, relative to gitpath. The saved string ends with #line_number, which indicates the line within the script at which @tag! was called.

Examples

julia> d = Dict(:x => 3)
Dict{Symbol,Int64} with 1 entry:
  :x => 3

julia> @tag!(d) # running from a script or inline evaluation of Juno
Dict{Symbol,Any} with 3 entries:
  :commit => "618b72bc0936404ab6a4dd8d15385868b8299d68"
  :script => "test\\stools_tests.jl#10"
  :x      => 3

Please notice that tag! operates in place only when possible, i.e. when the dictionary's value type can hold a String. Otherwise a new dictionary is returned. Also (importantly) these functions will never error: they are most commonly used when saving simulations, and an error could risk data not being saved!
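To make the "in place only when possible" rule concrete, here is a small sketch of the decision. It is not DrWatson's code: `sketch_tag` is a hypothetical stand-in that receives the commit string as an argument instead of querying Git, and it only handles dictionaries with Symbol keys.

```julia
# Hypothetical stand-in for tag!, with the commit passed in explicitly.
function sketch_tag(d::Dict{Symbol,V}, commit::Union{String,Nothing}) where {V}
    # Do nothing if no repository was found or the key already exists.
    (commit === nothing || haskey(d, :commit)) && return d
    if String <: V
        d[:commit] = commit                 # the value type can hold a String: mutate in place
        return d
    else
        widened = Dict{Symbol,Any}(d)       # otherwise copy into a dictionary with a wider value type
        widened[:commit] = commit
        return widened
    end
end

commit = "96df587e45b29e7a46348a3d780db1f85f41de04"

narrow = Dict(:x => 3, :y => 4)             # Dict{Symbol,Int64}, cannot store a String
wide = Dict{Symbol,Any}(:x => 3)            # can store a String

tagged_narrow = sketch_tag(narrow, commit)  # a new dictionary; narrow is left untouched
tagged_wide = sketch_tag(wide, commit)      # the very same object as wide
untagged = sketch_tag(narrow, nothing)      # no repository: d is returned unchanged
```

In practice this means the return value should always be used (e.g. `d = tag!(d)`), since the original dictionary is not guaranteed to carry the new field.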

Automatic Tagging during Saving

If you don't want to always call tag! before saving a file, you can just use tagsave or @tagsave, which can also nicely incorporate safesave if need be!

DrWatson.tagsave (Function)
tagsave(file::String, d::Dict [, safe = false, gitpath = projectdir()])

First tag! dictionary d and then save d in file. If safe = true save the file using safesave.

@tagsave(file::String, d::Dict [, safe = false, gitpath = projectdir()])

Same as tagsave but also add a field script that records the local path of the script that called @tagsave, see @tag!.
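A minimal usage sketch of tagsave follows. It assumes BSON is installed as the saving backend; the file location and the dictionary contents are invented for the example, and only the two-argument form from the signature above is used.

```julia
using DrWatson
using BSON   # assumed backend, installed separately

# Illustrative results of some simulation.
d = Dict(:x => 3, :error => 0.02)

# The "sims" subdirectory does not exist yet; the saving functions call mkpath first.
file = joinpath(mktempdir(), "sims", "run.bson")

tagsave(file, d)      # tag d (when a Git repository is found at gitpath), then save it
loaded = load(file)   # contains :x and :error, plus :commit if tagging succeeded
```

Whether the loaded dictionary has a commit field depends on whether projectdir() is inside a Git repository when the code runs.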


Produce or Load

produce_or_load is a function that integrates with savename to load a file if it exists or, if it doesn't, to produce it, save it and then return it!

This saves you the effort of checking whether a file exists and then either loading it or running some code and saving the result, which would otherwise mean writing a bunch of if clauses in your code! produce_or_load really shines in interactive sessions where some results require a couple of minutes to compute.

produce_or_load([prefix="",] c, f; kwargs...) -> file

Let s = savename(prefix, c, suffix). If a file named s exists then load it and return it.

If the file does not exist then call file = f(c), save file as s and then return the file. The function f must return a dictionary. The macros @dict and @strdict can help with that.

Keywords

  • tag = true : Add the Git commit of the project in the saved file.
  • gitpath = projectdir() : Path to search for a Git repo.
  • suffix = "bson" : Used in savename.
  • force = false : If true then don't check if file s exists and produce it and save it anyway.
  • kwargs... : All other keywords are propagated to savename.

See also savename and tag!.
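The control flow described above can be sketched in a few lines of plain Julia. This is an illustration under stated assumptions, not DrWatson's implementation: `sketch_name` is a hypothetical, simplified stand-in for savename, the standard library Serialization replaces a real saving backend, and tagging is left out.

```julia
using Serialization   # standard library stand-in for a backend such as BSON

# Hypothetical, simplified stand-in for savename: prefix, then sorted key=value pairs, then suffix.
function sketch_name(prefix, c::Dict, suffix)
    parts = ["$(k)=$(c[k])" for k in sort(collect(keys(c)))]
    return string(prefix, join(parts, "_"), ".", suffix)
end

# Hypothetical sketch of the load-or-produce logic.
function sketch_produce_or_load(prefix, c, f; suffix = "jls", force = false)
    s = sketch_name(prefix, c, suffix)
    if !force && isfile(s)
        return deserialize(s)    # the file exists: load it and return it
    end
    file = f(c)                  # otherwise produce the data (f returns a dictionary)
    mkpath(dirname(s))
    serialize(s, file)           # save it under the name s
    return file                  # and return it
end

calls = Ref(0)                   # counts how often the "expensive" function really runs
function simulate(c)
    calls[] += 1
    return Dict(:result => c[:a] * c[:N], :a => c[:a], :N => c[:N])
end

c = Dict(:a => 0.5, :N => 10)
prefix = joinpath(mktempdir(), "sim_")

first_run = sketch_produce_or_load(prefix, c, simulate)    # produces and saves
second_run = sketch_produce_or_load(prefix, c, simulate)   # loads from disk instead
calls_without_force = calls[]

forced_run = sketch_produce_or_load(prefix, c, simulate; force = true)   # always recomputes
```

The second call returns the same data without running `simulate` again, which is the behavior that makes this pattern convenient in interactive sessions.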