localuser@bscc-vm-vikas:~$
Roman E. Reggiardo, Vikas Peddu
11 July, 2023
What does data look like? Where do we keep it?
The command line gives us an opportunity to work with computers in a more direct way compared to “Folders” and “Files”.
Look at your desktop, what’s there?
Folders?
What’s inside?
Computers really operate in binary, on 1's and 0's, but just beneath 'Files' and 'Folders' they work with:
“Folders” == “Directories”
“Files” == “Paths”
How will your experience be different when you switch from “human think” to “command line/computer think”?
Using the Graphical User Interface (GUI), navigate to the Documents folder and create a folder called:
BSCC_2023_folder
Inside it, create a file:
README.txt
and three folders:
data/
notes/
code/
Make sure to keep the folder names lowercase!
Why is this useful? What could it help with moving forward?
We can create, copy, rename, and move files using a GUI. It works, and it's easy and intuitive. But it is also:
only possible if a GUI is provided
not reproducible (could someone else get the same result? yes…but in the same exact way? maybe not.)
limited to what the GUI/Operating System designers decided was useful for most users.
Bioinformatics asks us to do much more with data, and thus computers, than most.
On your desktops you can view and interact with the command line with a terminal – a piece of software built to enable our use of command line programming
Instead of the mouse, how will we navigate the command line?
The command line prompt:
localuser@bscc-vm-vikas:~$
user @ host : location $ commands you enter here
user: the name your computer calls you (we're all bscc-VM)
host: the computer you're using
location: the directory/folder you're in
$: prompt that tells you commands come next
If you see something like this, it's code (with the expected output below it) that you can execute in your terminal on the command line. Just copy and paste it or type it in yourself, hit Enter, and follow along.
Lines that start with # and are grayed out are comments that don't run as code; they are just notes.
The command line accepts…commands!
pwd and ls tell us where we are and what's in here:
pwd
/Users/vikas/Documents/UCSC/teaching/ucsc_scbc_2022/slides/01_command_line_and_bash
ls
01a_file_explorer_to_cmdline.qmd
01a_file_explorer_to_cmdline.rmarkdown
01a_file_explorer_to_cmdline_files
01b_commandline_to_bash.html
01b_commandline_to_bash.qmd
01b_commandline_to_bash_files
01c_scripting_for_automation.html
01c_scripting_for_automation.qmd
01c_scripting_for_automation_files
01d_intermediate_bash_scripting.html
01d_intermediate_bash_scripting.qmd
01d_intermediate_bash_scripting_files
cd
Our command line “mouse” – move somewhere, anywhere
/Users/vikas/Documents/UCSC/teaching/ucsc_scbc_2022/slides/01_command_line_and_bash/BSCC_2023_folder
hint: try hitting tab to complete the rest of your statements – it’ll auto-complete (like texting, etc)
/Users/vikas/Documents/UCSC/teaching/ucsc_scbc_2022/slides/01_command_line_and_bash
/Users/vikas/Documents/UCSC/teaching/ucsc_scbc_2022/slides/01_command_line_and_bash/BSCC_2023_folder
These are really useful shortcuts that correspond to location
# change directory to BSCC_2023_folder
cd BSCC_2023_folder
# change directory to the one above/before/in front of pwd
cd ..
pwd .
/Users/vikas/Documents/UCSC/teaching/ucsc_scbc_2022/slides/01_command_line_and_bash
As we just saw, cd .. tells us to go from:
/Users/vikas/Documents/UCSC/teaching/ucsc_scbc_2022/slides/01_command_line_and_bash/BSCC_2023_folder
to
/Users/vikas/Documents/UCSC/teaching/ucsc_scbc_2022/slides/01_command_line_and_bash
Looking again, pwd . is the same as pwd because . refers to the present working directory, while .. is one step up.
What is pwd telling us?
What pwd 'returns' is a string of text:
/Users/vikas/Documents/UCSC/teaching/ucsc_scbc_2022/slides/01_command_line_and_bash
It is the path to our present working directory; each directory along the path is separated by /
When you cd to a different directory, it has to be reachable by a valid path.
From
/Users/vikas/Documents/UCSC/teaching/ucsc_scbc_2022/slides/01_command_line_and_bash
we can go to
/Users/vikas/Documents/UCSC/teaching/ucsc_scbc_2022/slides/01_command_line_and_bash/BSCC_2023_folder
but not to Documents.
To get to the Documents directory from
/Users/vikas/Documents/UCSC/teaching/ucsc_scbc_2022/slides/01_command_line_and_bash
directly, we need to provide the entire path. For example, the entire path to the teaching directory looks like:
/Users/vikas/Documents/UCSC/teaching
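The difference between moving by name and moving by entire path can be sketched as follows. This is a minimal example under stated assumptions: it builds a throwaway practice tree with mktemp so nothing in your real Documents folder is touched, and the sandbox variable and the slides directory name are hypothetical stand-ins for the paths above.

```shell
# Build a small practice tree in a throwaway location (hypothetical names)
sandbox=$(mktemp -d)
mkdir -p "$sandbox/slides/BSCC_2023_folder"

# Relative path: works because BSCC_2023_folder sits directly inside slides
cd "$sandbox/slides"
cd BSCC_2023_folder
after_relative=$(pwd)
echo "$after_relative"

# Entire (absolute) path: starts with / and works from anywhere
cd "$sandbox"
after_absolute=$(pwd)
echo "$after_absolute"

# .. always means "one directory up" from wherever we are
cd "$sandbox/slides/BSCC_2023_folder"
cd ..
after_dotdot=$(pwd)
echo "$after_dotdot"
```

A path that begins with / is read from the top of the file system, so it is valid no matter where you currently are; a bare name is only valid if that directory is inside your present working directory.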
Before beginning, please run:
cd
1. What do you think cd on its own does?
2. Go to the Pictures/ directory. Is anything in there?
3. Go to the BSCC_2023_folder/data/ directory. What does pwd return?
Do you like moving around with the command line? Is it easier than using the mouse? Is it more challenging?
pwd: return present working directory - man pwd
ls: list contents of directory - man ls
cd: change directory - man cd
. and ..: represent the present working directory and the directory one level up, respectively
manual
The man tool prints out a manual for the tool you provide it
PWD(1) BSD General Commands Manual PWD(1)
NAME
pwd -- return working directory name
SYNOPSIS
pwd [-L | -P]
DESCRIPTION
The pwd utility writes the absolute pathname of the current working
directory to the standard output.
Some shells may provide a builtin pwd command which is similar or
identical to this utility. Consult the builtin(1) manual page.
The options are as follows:
-L Display the logical current working directory.
-P Display the physical current working directory (all symbolic
links resolved).
If no options are specified, the -L option is assumed.
ENVIRONMENT
Environment variables used by pwd:
PWD Logical current working directory.
EXIT STATUS
The pwd utility exits 0 on success, and >0 if an error occurs.
SEE ALSO
builtin(1), cd(1), csh(1), sh(1), getcwd(3)
STANDARDS
The pwd utility conforms to IEEE Std 1003.1-2001 (``POSIX.1'').
BUGS
In csh(1) the command dirs is always faster because it is built into that
shell. However, it can give a different answer in the rare case that the
current directory or a containing directory was moved after the shell
descended into it.
The -L option does not work unless the PWD environment variable is
exported by the shell.
BSD April 12, 2003 BSD
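To see what the -L and -P options in the manual mean in practice, here is a small sketch. It assumes a throwaway directory made with mktemp; real_dir and link_dir are hypothetical names, and link_dir is a symbolic link (a shortcut) pointing at real_dir.

```shell
# Throwaway practice area (hypothetical names)
sandbox=$(mktemp -d)
mkdir "$sandbox/real_dir"
ln -s "$sandbox/real_dir" "$sandbox/link_dir"   # link_dir is a shortcut to real_dir

# Enter the directory through the shortcut
cd "$sandbox/link_dir"

logical=$(pwd -L)    # the path as we typed it, shortcut included
physical=$(pwd -P)   # the path with all symbolic links resolved

echo "logical:  $logical"
echo "physical: $physical"
```

The logical path should end in link_dir and the physical path in real_dir. Most of the time there are no symbolic links involved and the two options print the same thing.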
Now we know how to move, what else have we done with the file explorer/mouse that we can do on the command line?
Moving is great, but we can do much more
01a_file_explorer_to_cmdline.qmd
01a_file_explorer_to_cmdline.rmarkdown
01a_file_explorer_to_cmdline_files
01b_commandline_to_bash.html
01b_commandline_to_bash.qmd
01b_commandline_to_bash_files
01c_scripting_for_automation.html
01c_scripting_for_automation.qmd
01c_scripting_for_automation_files
01d_intermediate_bash_scripting.html
01d_intermediate_bash_scripting.qmd
01d_intermediate_bash_scripting_files
BSCC_2023_folder
temp.txt
touch is our command to create a file, not a directory.
We will also need to create directories to help organize our files.
Since it's a directory, we can cd into it.
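The steps above can be sketched as follows. The command used to create temporary_dir is not shown here, so this example assumes it was mkdir, the standard tool for making directories. It runs inside a throwaway directory from mktemp (the sandbox variable is hypothetical) so your real folders are untouched.

```shell
# Work in a throwaway location (hypothetical variable name)
sandbox=$(mktemp -d)
cd "$sandbox"

touch temp.txt         # creates an empty file
mkdir temporary_dir    # creates an empty directory (assumed command)

ls                     # both temp.txt and temporary_dir are listed

cd temporary_dir       # works, because temporary_dir is a directory
pwd
```

Trying cd temp.txt instead would fail, because temp.txt is a file and not a directory.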
If we want to organize, we need to be able to move things
# move a **file** to a different **directory**
# mv [file you want to move] [destination you want to target]
mv temp.txt temporary_dir
ls temporary_dir
temp.txt
The file is no longer present in our pwd:
01a_file_explorer_to_cmdline.qmd
01a_file_explorer_to_cmdline.rmarkdown
01a_file_explorer_to_cmdline_files
01b_commandline_to_bash.html
01b_commandline_to_bash.qmd
01b_commandline_to_bash_files
01c_scripting_for_automation.html
01c_scripting_for_automation.qmd
01c_scripting_for_automation_files
01d_intermediate_bash_scripting.html
01d_intermediate_bash_scripting.qmd
01d_intermediate_bash_scripting_files
BSCC_2023_folder
temporary_dir
Sometimes moving isn't quite what we want; in that case we can copy a file and keep the original.
# copy a **file** to a different **directory**
# cp [file you want to copy] [destination you want to target]
cp temporary_dir/temp.txt .
ls temporary_dir
temp.txt
The file now exists in both directories
/Users/vikas/Documents/UCSC/teaching/ucsc_scbc_2022/slides/01_command_line_and_bash
01a_file_explorer_to_cmdline.qmd
01a_file_explorer_to_cmdline.rmarkdown
01a_file_explorer_to_cmdline_files
01b_commandline_to_bash.html
01b_commandline_to_bash.qmd
01b_commandline_to_bash_files
01c_scripting_for_automation.html
01c_scripting_for_automation.qmd
01c_scripting_for_automation_files
01d_intermediate_bash_scripting.html
01d_intermediate_bash_scripting.qmd
01d_intermediate_bash_scripting_files
BSCC_2023_folder
temp.txt
temporary_dir
Well, we called these temporary, so let’s get rid of them carefully
01a_file_explorer_to_cmdline.qmd
01a_file_explorer_to_cmdline.rmarkdown
01a_file_explorer_to_cmdline_files
01b_commandline_to_bash.html
01b_commandline_to_bash.qmd
01b_commandline_to_bash_files
01c_scripting_for_automation.html
01c_scripting_for_automation.qmd
01c_scripting_for_automation_files
01d_intermediate_bash_scripting.html
01d_intermediate_bash_scripting.qmd
01d_intermediate_bash_scripting_files
BSCC_2023_folder
temporary_dir
This doesn't destroy the copy in temporary_dir.
We can also remove directories with rm -rf (we'll come back to this later). This also destroys the files within.
Now try to cd into our deleted directory.
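A minimal sketch of the clean-up described above, assuming the same temp.txt and temporary_dir names and a throwaway directory from mktemp (the sandbox variable is hypothetical). Because rm is irreversible, practicing in a throwaway location first is a sensible habit.

```shell
# Recreate the situation in a throwaway location (hypothetical variable name)
sandbox=$(mktemp -d)
cd "$sandbox"
mkdir temporary_dir
touch temp.txt temporary_dir/temp.txt

# Remove only the file in our present working directory
rm temp.txt
ls temporary_dir       # the copy inside temporary_dir is still there

# Remove the directory and everything inside it
rm -rf temporary_dir

# cd now fails, so the message after || is printed instead
cd temporary_dir 2>/dev/null || echo "temporary_dir no longer exists"
```

The -r flag makes rm work through a directory and its contents, and -f tells it not to ask for confirmation, which is why the combination deserves care.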
For the same task, which is faster: file explorer or command line?
Make the directory BSCC_2023_dir with the same contents as BSCC_2023_folder.
Delete the BSCC_2023_folder directory.
There are multiple ways to do this; see if you can find the fastest approach.
Is it starting to seem like the command line might do some things faster/cleaner/better than the file explorer?
We've already experienced a minor roadblock in the GUI: there's no text editor installed.
The command line can help us here. Just print the whole file out in the terminal with cat.
cat works great when we have existing files with content, but what about printing in general?
echo is not limited to sending text to the terminal; we can also send its output to a file.
# `>` tells cmd line to send the output to a specific file, rather than printing to terminal
echo 'Banana slugs have no known predators' > BSCC_2023_dir/README.txt
cat BSCC_2023_dir/README.txt
Banana slugs have no known predators
Something new here: > is letting us write our echo output to our BSCC_2023_dir/README.txt file.
But in this case, we overwrote our original statement!
How can we regenerate our original BSCC_2023_dir/README.txt?
Much like . and .., > and >> have distinct functions.
# `>>` lets us append output to an existing file, line by line
echo 'Banana slugs have no known predators' >> BSCC_2023_dir/README.txt
cat BSCC_2023_dir/README.txt
Vikas, Roman, and Daniel love to code!
Banana slugs have no known predators
now we have both lines!
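The difference between > and >> can be seen side by side in this sketch. It reuses the two README lines from above but writes them to a throwaway file created with mktemp (the sandbox and readme variables are hypothetical), so your real README.txt is not changed.

```shell
# Practice file in a throwaway location (hypothetical variable names)
sandbox=$(mktemp -d)
readme="$sandbox/README.txt"

# > creates the file, or replaces its contents if it already exists
echo 'Vikas, Roman, and Daniel love to code!' > "$readme"

# >> adds a new line to the end and keeps what was already there
echo 'Banana slugs have no known predators' >> "$readme"
cat "$readme"          # prints both lines

# Using > again throws away everything that was in the file
echo 'Banana slugs have no known predators' > "$readme"
cat "$readme"          # prints only this one line
```

A useful habit: reach for >> by default, and use > only when you really mean to start the file over.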
echo would be a pretty laborious way to write something like an essay, but what real uses might it have?
cat is great when we want to see everything, but sometimes files are huge and that would be... difficult.
We can also search for key terms in files:
# `grep` will return lines that contain the input argument
# grep [text to search for] [file to search in]
grep "Roman" BSCC_2023_dir/README.txt
Vikas, Roman, and Daniel love to code!
the text doesn’t have to be in the beginning:
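As a sketch of this, the searches below match text in the middle and at the end of a line. The example assumes a throwaway copy of the two README lines written with mktemp (the sandbox and readme variables are hypothetical).

```shell
# Practice file in a throwaway location (hypothetical variable names)
sandbox=$(mktemp -d)
readme="$sandbox/README.txt"
echo 'Vikas, Roman, and Daniel love to code!' > "$readme"
echo 'Banana slugs have no known predators' >> "$readme"

grep "Daniel" "$readme"       # match sits in the middle of the first line
grep "predators" "$readme"    # match sits at the end of the second line
grep -c "slugs" "$readme"     # -c counts matching lines instead of printing them
```

grep always prints the whole line that contains the match, wherever in the line the match occurs.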
Does this beat opening up Microsoft Word, or are you still skeptical?
echo: prints whatever text you want - man echo
cat: prints the entirety of a file (more uses soon) - man cat
grep: finds lines containing your input text inside a given file - man grep
rm: irreversibly deletes files; the -rf flags enable forced, recursive deletion of directories - man rm
mv: moves a file from directory A to directory B - man mv
cp: copies a file from directory A to directory B - man cp
cp: copy
Before you run the following code, cd to your BSCC_2023_dir/data directory.
Now you've got a file: talking.txt (make sure you know where it is).
It contains two types of line: a line for speakers and a line for statements.
Run head -2 and tail -2 on talking.txt. What do they do?
Finally, use echo to add your own speaker ID line and statement of choice (hint: \n can be used to represent a new line).
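One caveat on the \n hint: whether echo turns \n into a real new line depends on the shell you are using. A sketch of an alternative that behaves the same way across shells is printf, which always interprets \n in its first argument. The speaker and statement text below is hypothetical (the real talking.txt format may differ), and the file lives in a throwaway mktemp directory.

```shell
# Practice file in a throwaway location (hypothetical variable names)
sandbox=$(mktemp -d)
file="$sandbox/talking.txt"

# Hypothetical speaker and statement lines; the real talking.txt format may differ
printf 'Speaker: Sammy\nStatement: Slugs are great\n' >> "$file"

cat "$file"            # two separate lines
wc -l < "$file"        # counts the lines in the file
```

If echo prints a literal \n on your machine, printf is the simplest way around it.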
What other steps in the last practice could we theoretically use the command line for instead of doing them manually?
Part 1:
Within your BSCC_2023_dir/data directory, create a new file: book_of_poems.txt.
Using echo and the \n trick, add as many lines of speakers and statements as you'd like, following the format of Practice 04.
Part 2:
Now, using grep --help, figure out a way to make two new files:
poetry.txt: contains only the statements from book_of_poems.txt (maybe they can form a coherent poem together?).
authors.txt: contains only the names of the authors (plus any annotation that already existed).
cp: copy files - man cp
head: print only the first 10 lines of a file (default) - man head
tail: print only the last 10 lines of a file (default) - man tail
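A short sketch of the head and tail defaults, using a hypothetical 12-line file in a throwaway mktemp directory. The -n 2 form used here is the standard way to ask for 2 lines and does the same job as the -2 shorthand from the practice.

```shell
# Practice file with 12 numbered lines (hypothetical names)
sandbox=$(mktemp -d)
file="$sandbox/numbers.txt"
for i in 1 2 3 4 5 6 7 8 9 10 11 12; do
  echo "line $i"
done > "$file"

head "$file" | wc -l   # with no options, head prints the first 10 lines
head -n 2 "$file"      # only the first 2 lines
tail -n 2 "$file"      # only the last 2 lines
```

On a huge file this is much friendlier than cat: you get a quick look at the start or end without flooding the terminal.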
What does automation mean, to you, with work that you might try to accomplish with data?