Problem
Imagine we have a one-off task: copying a subset of files from one S3 bucket to another.
We’re going to use the aws s3 ls s3://a_bucket/path/ command to get the list of files and then copy only the ones that match a pattern.
Let’s assume the resulting file list looks something like this:
$ cat in.txt
2013-03-29 14:35:03 10314739 20130314T011325_WCBK_SCHEDULE.XML
2013-03-29 14:35:07 81378 20130314T012706_CBK_TOURNAMENT.XML
2013-03-29 14:35:07 11659596 20130314T012735_CBK_SCHEDULE.XML
2013-03-29 14:35:18 421 20130314T100002_CBK_LIVE.XML
2013-03-29 14:35:18 421 20130314T100028_CBK_LIVE.XML
2013-03-29 14:35:18 452 20130314T100028_CBK_SCORES.XML
2013-03-29 14:35:18 457 20130314T100634_WCBK_SCORES.XML
2013-03-29 14:35:22 421 20130314T131835_CBK_LIVE.XML
2013-03-29 14:35:22 11707386 20130314T131911_CBK_SCHEDULE.XML
2013-03-29 14:35:22 452 20130314T131938_CBK_SCORES.XML
#and 10K more lines to follow.
Now we want to transform this file-list into a script that copies selected files into a destination bucket.
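The selection pattern is not spelled out here, so the following is only a minimal sketch of the filtering step. The helper name select_listing and the pattern _CBK_SCORES\.XML$ are assumptions made up for illustration; grep -E is used because the pattern is a plain extended regular expression.

```shell
#!/bin/sh
# Hypothetical helper (not from the article): keep only the listing lines
# that match the extended regular expression given as $1.
# The listing is read from stdin.
select_listing() {
    grep -E -- "$1"
}

# Assumed pattern for illustration: names ending in _CBK_SCORES.XML.
select_listing '_CBK_SCORES\.XML$' <<'EOF'
2013-03-29 14:35:18 421 20130314T100028_CBK_LIVE.XML
2013-03-29 14:35:18 452 20130314T100028_CBK_SCORES.XML
2013-03-29 14:35:18 457 20130314T100634_WCBK_SCORES.XML
EOF
```

Because the pattern is anchored at the end of the line, it is effectively matched against the file name. Inside Vim, a comparable selection could probably be done with a :v/pattern/d step, which deletes every line that does not match.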
Solution
While there are many ways to transform text, this time I’d like to use Vim’s ex mode to do the job, just as I’d have used sed, awk, perl, or a ruby script.
Because I use Vim most of the time, I often think in “Vim motions” when working with text.
Here’s the resulting command:
$ vim \
-N \
-u NONE \
./in.txt \
-c ':%norm $Bd^' \
-c ':%norm ^ytTP^3lylpr/2lylpr/2lylpr/' \
-c ":%norm Iaws s3 cp 's3://a_bucket/path/" \
-c ":%norm \$a'" \
-c ':%norm $a s3://dest_bucket/path2/' \
-c ':saveas! ./out.txt' \
-c ':qall!'
Running the command above produces the following text, saved to ./out.txt:
aws s3 cp 's3://a_bucket/path/2013/03/14/20130314T011325_WCBK_SCHEDULE.XML' s3://dest_bucket/path2/
aws s3 cp 's3://a_bucket/path/2013/03/14/20130314T012706_CBK_TOURNAMENT.XML' s3://dest_bucket/path2/
aws s3 cp 's3://a_bucket/path/2013/03/14/20130314T012735_CBK_SCHEDULE.XML' s3://dest_bucket/path2/
aws s3 cp 's3://a_bucket/path/2013/03/14/20130314T100002_CBK_LIVE.XML' s3://dest_bucket/path2/
aws s3 cp 's3://a_bucket/path/2013/03/14/20130314T100028_CBK_LIVE.XML' s3://dest_bucket/path2/
aws s3 cp 's3://a_bucket/path/2013/03/14/20130314T100028_CBK_SCORES.XML' s3://dest_bucket/path2/
aws s3 cp 's3://a_bucket/path/2013/03/14/20130314T100634_WCBK_SCORES.XML' s3://dest_bucket/path2/
aws s3 cp 's3://a_bucket/path/2013/03/14/20130314T131835_CBK_LIVE.XML' s3://dest_bucket/path2/
aws s3 cp 's3://a_bucket/path/2013/03/14/20130314T131911_CBK_SCHEDULE.XML' s3://dest_bucket/path2/
aws s3 cp 's3://a_bucket/path/2013/03/14/20130314T131938_CBK_SCORES.XML' s3://dest_bucket/path2/
#and 10K more lines to follow.
Now we can just source out.txt to get the copying going.
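For comparison, here is a rough sketch of the same transformation done with awk, one of the alternatives mentioned earlier. This is not the Vim approach itself, just an equivalent under the assumption that file names contain no spaces and always start with an eight-digit date. The function name make_copy_script is made up for this example; the bucket names are the placeholders used above.

```shell
#!/bin/sh
# Sketch: build the same copy commands with awk instead of Vim.
# make_copy_script is a hypothetical name; it reads an "aws s3 ls" style
# listing from the files given as arguments, or from stdin if there are none.
make_copy_script() {
    awk 'NF >= 4 {
        # The last whitespace-separated field is the file name.
        name = $NF
        # Build yyyy/mm/dd from the first eight characters of the name.
        datepath = substr(name, 1, 4) "/" substr(name, 5, 2) "/" substr(name, 7, 2)
        printf "aws s3 cp '\''s3://a_bucket/path/%s/%s'\'' s3://dest_bucket/path2/\n", datepath, name
    }' "$@"
}

# Example: one line of listing in, one copy command out.
printf '%s\n' '2013-03-29 14:35:18 421 20130314T100002_CBK_LIVE.XML' | make_copy_script
```

Running make_copy_script ./in.txt > ./out.txt should give the same kind of output as the Vim command, which makes it a handy cross-check.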
What’s happened
Refer back to how vim is run above, line by line:
- $ vim starts Vim.
- -u NONE tells vim not to use .vimrc (just to avoid using any plugins etc.).
- ./in.txt tells vim to read the ./in.txt file.
- -c ':%norm $Bd^' is a normal mode command that is applied to every line (%). It goes to the end of the line ($), returns to the Beginning of the word (B), and deletes (d) everything back to the beginning of the line (^).
- -c ':%norm ^ytTP^3lylpr/2lylpr/2lylpr/' copies the date part of the timestamp at the start of the file name and turns it into a /2013/mon/day path.
- -c ":%norm Iaws s3 cp 's3://a_bucket/path/" inserts aws s3 cp 's3://a_bucket/path/ at the beginning of each line.
- -c ":%norm \$a'" appends the closing quote. Note that $ is escaped for Bash, because this argument is in double quotes.
- -c ':%norm $a s3://dest_bucket/path2/' appends the destination bucket name to the end of each line.
- -c ':saveas! ./out.txt' saves the resulting file.
- -c ':qall!' finishes the processing.
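The date-path step is the hardest one to read, so here is a small sketch that spells it out. The comments trace my reading of what each group of keystrokes does to a single line; the sed expression below them is a swapped-in equivalent for checking the result outside Vim, not part of the original command. The function name date_path_prefix is made up for this example.

```shell
#!/bin/sh
# Keystroke trace for ^ytTP^3lylpr/2lylpr/2lylpr/ on one line:
#   start      20130314T100002_CBK_LIVE.XML
#   ^ytTP      2013031420130314T100002_CBK_LIVE.XML    (yank up to T, paste before)
#   ^3lylpr/   2013/031420130314T100002_CBK_LIVE.XML   (duplicate a char, replace copy with /)
#   2lylpr/    2013/03/1420130314T100002_CBK_LIVE.XML
#   2lylpr/    2013/03/14/20130314T100002_CBK_LIVE.XML

# Hypothetical sed equivalent: prefix each name with yyyy/mm/dd/ taken from
# the eight digits before the T. The & keeps the original date in the name.
date_path_prefix() {
    sed -E 's#^([0-9]{4})([0-9]{2})([0-9]{2})T#\1/\2/\3/&#'
}

printf '%s\n' '20130314T100002_CBK_LIVE.XML' | date_path_prefix
```

The duplicate-then-replace trick (ylp followed by r/) is just a way to insert a single character without leaving normal mode.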
Conclusion
- the resulting script is not the most efficient one, as it launches a process per file
- it may not be the best tool if you have to share or reuse the script with someone not familiar with Vim
- each step/stage can be copied and run/tested in Vim on its own
- :norm is a pretty awesome command for running normal mode commands from command mode