14. How can I work on a remote server?#
Today we will connect to a remote server and learn new bash commands for working with the content of files.
14.1. What are remote servers and HPC systems?#
14.2. Connecting to Seawulf#
We connect with secure shell or ssh
from our terminal (GitBash or Putty on windows) to URI’s teaching High Performance Computing (HPC) Cluster Seawulf.
Our login is the part of your uri e-mail address before the @ and I will tell you how to find your default password if you missed class (do not want to post it publicly). Comment on your experience report PR to ask for this information.
ssh -l brownsarahm seawulf.uri.edu
When it logs in it looks like this and requires you to change your password. They configure it with a default and with it past expired.
The authenticity of host 'seawulf.uri.edu (131.128.217.210)' can't be established.
ECDSA key fingerprint is SHA256:RwhTUyjWLqwohXiRw+tYlTiJEbqX2n/drCpkIwQVCro.
Are you sure you want to continue connecting (yes/no/[fingerprint])? y
Please type 'yes', 'no' or the fingerprint: yes
Warning: Permanently added 'seawulf.uri.edu,131.128.217.210' (ECDSA) to the list of known hosts.
brownsarahm@seawulf.uri.edu's password:
You are required to change your password immediately (root enforced)
WARNING: Your password has expired.
You must change your password now and login again!
Changing password for user brownsarahm.
Changing password for brownsarahm.
(current) UNIX password:
New password:
Retype new password:
passwd: all authentication tokens updated successfully.
Connection to seawulf.uri.edu closed.
Important
You use the default password when prompted for your username’s password. Then again when it asks for the (current) UNIX password:
. Then you must type the same, new password twice.
Choose a new password you will remember, we will come back to this server
after you give it a new password, then it logs you out and you have to log back in.
brownsarahm@~ $ ssh -l brownsarahm seawulf.uri.edu
brownsarahm@seawulf.uri.edu's password:
Last login: Tue Mar 8 12:52:38 2022 from 172.20.133.152
We have logged into our home directory which is empty
[brownsarahm@seawulf ~]$ ls
[brownsarahm@seawulf ~]$ pwd
/home/brownsarahm
[brownsarahm@seawulf ~]$ whoami
brownsarahm
Notice that the prompt says uriusername@seawulf
to indicate that you are logged into the server, not working locally.
14.3. Downloading files#
wget
allows you to get files from the web.
[brownsarahm@seawulf ~]$ wget http://www.hpc-carpentry.org/hpc-shell/files/bash-lesson.tar.gz
--2022-03-08 12:58:09-- http://www.hpc-carpentry.org/hpc-shell/files/bash-lesson.tar.gz
Resolving www.hpc-carpentry.org (www.hpc-carpentry.org)... 104.21.33.152, 172.67.146.136
Connecting to www.hpc-carpentry.org (www.hpc-carpentry.org)|104.21.33.152|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 12534006 (12M) [application/gzip]
Saving to: ‘bash-lesson.tar.gz’
100%[======================================>] 12,534,006 4.19MB/s in 2.9s
2022-03-08 12:58:12 (4.19 MB/s) - ‘bash-lesson.tar.gz’ saved [12534006/12534006]
[brownsarahm@seawulf ~]$ ls
bash-lesson.tar.gz
[brownsarahm@seawulf ~]$ tar -xvf bash-lesson.tar.gz
dmel-all-r6.19.gtf
dmel_unique_protein_isoforms_fb_2016_01.tsv
gene_association.fb
SRR307023_1.fastq
SRR307023_2.fastq
SRR307024_1.fastq
SRR307024_2.fastq
SRR307025_1.fastq
SRR307025_2.fastq
SRR307026_1.fastq
SRR307026_2.fastq
SRR307027_1.fastq
SRR307027_2.fastq
SRR307028_1.fastq
SRR307028_2.fastq
SRR307029_1.fastq
SRR307029_2.fastq
SRR307030_1.fastq
SRR307030_2.fastq
[brownsarahm@seawulf ~]$ ls
bash-lesson.tar.gz SRR307026_1.fastq
dmel-all-r6.19.gtf SRR307026_2.fastq
dmel_unique_protein_isoforms_fb_2016_01.tsv SRR307027_1.fastq
gene_association.fb SRR307027_2.fastq
SRR307023_1.fastq SRR307028_1.fastq
SRR307023_2.fastq SRR307028_2.fastq
SRR307024_1.fastq SRR307029_1.fastq
SRR307024_2.fastq SRR307029_2.fastq
SRR307025_1.fastq SRR307030_1.fastq
SRR307025_2.fastq SRR307030_2.fastq
14.4. Working with large files#
One of these files, contains the entire genome for the common fruitfly, let’s take a look at it: \
[brownsarahm@seawulf ~]$ cat dmel-all-r6.19.gtf
X FlyBase gene 19961297 19969323 . + . gene_id "FBgn0031081"; gene_symbol "Nep3";
2L FlyBase 3UTR 782825 782885 . + . gene_id "FBgn0041250"; gene_symbol "Gr21a"; transcript_id "FBtr0331651"; transcript_symbol "Gr21a-RB";
Warning
this output is truncated for display purposes
We see that this actually take a long time to output and is way tooo much information to actually read. In fact, in order to make the website work, I had to cut that content using command line tools, my text editor couldn’t open the file and GitHub was unhappy when I pushed it.
For a file like this, we don’t really want to read the whole file but we do need to know what it’s strucutred like in order to design programs to work with it.
head
lets us look at the first 10 lines.
[brownsarahm@seawulf ~]$ head dmel-all-r6.19.gtf
X FlyBase gene 19961297 19969323 . + . gene_id "FBgn0031081"; gene_symbol "Nep3";
X FlyBase mRNA 19961689 19968479 . + . gene_id "FBgn0031081"; gene_symbol "Nep3"; transcript_id "FBtr0070000"; transcript_symbol "Nep3-RA";
X FlyBase 5UTR 19961689 19961845 . + . gene_id "FBgn0031081"; gene_symbol "Nep3"; transcript_id "FBtr0070000"; transcript_symbol "Nep3-RA";
X FlyBase exon 19961689 19961845 . + . gene_id "FBgn0031081"; gene_symbol "Nep3"; transcript_id "FBtr0070000"; transcript_symbol "Nep3-RA";
X FlyBase exon 19963955 19964071 . + . gene_id "FBgn0031081"; gene_symbol "Nep3"; transcript_id "FBtr0070000"; transcript_symbol "Nep3-RA";
X FlyBase exon 19964782 19964944 . + . gene_id "FBgn0031081"; gene_symbol "Nep3"; transcript_id "FBtr0070000"; transcript_symbol "Nep3-RA";
X FlyBase exon 19965006 19965126 . + . gene_id "FBgn0031081"; gene_symbol "Nep3"; transcript_id "FBtr0070000"; transcript_symbol "Nep3-RA";
X FlyBase exon 19965197 19965511 . + . gene_id "FBgn0031081"; gene_symbol "Nep3"; transcript_id "FBtr0070000"; transcript_symbol "Nep3-RA";
X FlyBase exon 19965577 19966071 . + . gene_id "FBgn0031081"; gene_symbol "Nep3"; transcript_id "FBtr0070000"; transcript_symbol "Nep3-RA";
X FlyBase exon 19966183 19967012 . + . gene_id "FBgn0031081"; gene_symbol "Nep3"; transcript_id "FBtr0070000"; transcript_symbol "Nep3-RA";
We can use the -n
parameter to change the number.
And, tails shows the last few.
[brownsarahm@seawulf ~]$ tail dmel-all-r6.19.gtf
which in this case looks mostly the same
2L FlyBase exon 782124 782181 . + . gene_id "FBgn0041250"; gene_symbol "Gr21a"; transcript_id "FBtr0331651"; transcript_symbol "Gr21a-RB";
2L FlyBase exon 782238 782441 . + . gene_id "FBgn0041250"; gene_symbol "Gr21a"; transcript_id "FBtr0331651"; transcript_symbol "Gr21a-RB";
2L FlyBase exon 782495 782885 . + . gene_id "FBgn0041250"; gene_symbol "Gr21a"; transcript_id "FBtr0331651"; transcript_symbol "Gr21a-RB";
2L FlyBase start_codon 781297 781299 . + 0 gene_id "FBgn0041250"; gene_symbol "Gr21a"; transcript_id "FBtr0331651"; transcript_symbol "Gr21a-RB";
2L FlyBase CDS 781297 782048 . + 0 gene_id "FBgn0041250"; gene_symbol "Gr21a"; transcript_id "FBtr0331651"; transcript_symbol "Gr21a-RB";
2L FlyBase CDS 782124 782181 . + 1 gene_id "FBgn0041250"; gene_symbol "Gr21a"; transcript_id "FBtr0331651"; transcript_symbol "Gr21a-RB";
2L FlyBase CDS 782238 782441 . + 0 gene_id "FBgn0041250"; gene_symbol "Gr21a"; transcript_id "FBtr0331651"; transcript_symbol "Gr21a-RB";
2L FlyBase CDS 782495 782821 . + 0 gene_id "FBgn0041250"; gene_symbol "Gr21a"; transcript_id "FBtr0331651"; transcript_symbol "Gr21a-RB";
2L FlyBase stop_codon 782822 782824 . + 0 gene_id "FBgn0041250"; gene_symbol "Gr21a"; transcript_id "FBtr0331651"; transcript_symbol "Gr21a-RB";
2L FlyBase 3UTR 782825 782885 . + . gene_id "FBgn0041250"; gene_symbol "Gr21a"; transcript_id "FBtr0331651"; transcript_symbol "Gr21a-RB";
We can also see how much content is in the file wc
give a word count and with its -l
parameter gives us the number of lines.
[brownsarahm@seawulf ~]$ wc -l dmel-all-r6.19.gtf
542048 dmel-all-r6.19.gtf
Over five hundred forty thousand lines is a lot.
How can we get the number of lines in each of the .fastq
files?
[brownsarahm@seawulf ~]$ wc -l *.fastw
wc: *.fastw: No such file or directory
note that in my typo, it tells me no files matched my pattern.
[brownsarahm@seawulf ~]$ wc -l *.fastq
20000 SRR307023_1.fastq
20000 SRR307023_2.fastq
20000 SRR307024_1.fastq
20000 SRR307024_2.fastq
20000 SRR307025_1.fastq
20000 SRR307025_2.fastq
20000 SRR307026_1.fastq
20000 SRR307026_2.fastq
20000 SRR307027_1.fastq
20000 SRR307027_2.fastq
20000 SRR307028_1.fastq
20000 SRR307028_2.fastq
20000 SRR307029_1.fastq
20000 SRR307029_2.fastq
20000 SRR307030_1.fastq
20000 SRR307030_2.fastq
320000 total
when it does work, we also get the total.
We can use redirects as before to save these to a file:
[brownsarahm@seawulf ~]$ wc -l *.fastq > linecounts.txt
[brownsarahm@seawulf ~]$ cat linecounts.txt
20000 SRR307023_1.fastq
20000 SRR307023_2.fastq
20000 SRR307024_1.fastq
20000 SRR307024_2.fastq
20000 SRR307025_1.fastq
20000 SRR307025_2.fastq
20000 SRR307026_1.fastq
20000 SRR307026_2.fastq
20000 SRR307027_1.fastq
20000 SRR307027_2.fastq
20000 SRR307028_1.fastq
20000 SRR307028_2.fastq
20000 SRR307029_1.fastq
20000 SRR307029_2.fastq
20000 SRR307030_1.fastq
20000 SRR307030_2.fastq
320000 total
We can also search files, without loading them all into memory or displaying them, with grep
:
[brownsarahm@seawulf ~]$ grep Act5c dmel-all-r6.19.gtf
[brownsarahm@seawulf ~]$ grep mRNA dmel-all-r6.19.gtf
this output a lot, so the output is truncated here
X FlyBase mRNA 19961689 19968479 . + . gene_id "FBgn0031081"; gene_symbol "Nep3"; transcript_id "FBtr0070000"; transcript_symbol "Nep3-RA";
2L FlyBase mRNA 781276 782885 . + . gene_id "FBgn0041250"; gene_symbol "Gr21a"; transcript_id "FBtr0331651"; transcript_symbol "Gr21a-RB";
and we can combine grep
with wc
to count occurences.
[brownsarahm@seawulf ~]$ grep mRNA dmel-all-r6.19.gtf | wc -l
34025
14.5. File permissions#
Let’s make a small script, recalling what we have learned so far:
[brownsarahm@seawulf ~]$ echo "echo 'script works'" >> demo.sh
We can confirm that the script looks like a we expected
[brownsarahm@seawulf ~]$ cat demo.sh
echo 'script works'
One thing we could do is to run the script using ./
[brownsarahm@seawulf ~]$ ./demo.sh
but we get a permission denied error
-bash: ./demo.sh: Permission denied
By default, files have different types of permissions: read, write, and execute for different users that can access them. To view the permissions, we can use the -l
option of ls
.
[brownsarahm@seawulf ~]$ ls -l
total 138452
-rw-r--r--. 1 brownsarahm spring2022-csc392 12534006 Apr 18 2021 bash-lesson.tar.gz
-rw-r--r--. 1 brownsarahm spring2022-csc392 20 Mar 8 13:12 demo.sh
-rw-r--r--. 1 brownsarahm spring2022-csc392 77426528 Jan 16 2018 dmel-all-r6.19.gtf
-rw-r--r--. 1 brownsarahm spring2022-csc392 721242 Jan 25 2016 dmel_unique_protein_isoforms_fb_2016_01.tsv
-rw-r--r--. 1 brownsarahm spring2022-csc392 25056938 Jan 25 2016 gene_association.fb
-rw-r--r--. 1 brownsarahm spring2022-csc392 447 Mar 8 13:07 linecounts.txt
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625262 Jan 25 2016 SRR307023_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625262 Jan 25 2016 SRR307023_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625376 Jan 25 2016 SRR307024_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625376 Jan 25 2016 SRR307024_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625286 Jan 25 2016 SRR307025_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625286 Jan 25 2016 SRR307025_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625302 Jan 25 2016 SRR307026_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625302 Jan 25 2016 SRR307026_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625312 Jan 25 2016 SRR307027_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625312 Jan 25 2016 SRR307027_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625338 Jan 25 2016 SRR307028_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625338 Jan 25 2016 SRR307028_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625390 Jan 25 2016 SRR307029_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625390 Jan 25 2016 SRR307029_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625318 Jan 25 2016 SRR307030_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625318 Jan 25 2016 SRR307030_2.fastq
For each file we get 10 characters in the first column that describe the permissions. The 3rd column is the username of the owner, the fourth is the group, then size date revised and the file name.
We are most interested in the 10 character permissions. The fist column indicates if any are directories with a d
or a -
for files. We have no directories, but we can create one to see this.
[brownsarahm@seawulf ~]$ mkdir results
[brownsarahm@seawulf ~]$ ls -l
total 138452
-rw-r--r--. 1 brownsarahm spring2022-csc392 12534006 Apr 18 2021 bash-lesson.tar.gz
-rw-r--r--. 1 brownsarahm spring2022-csc392 20 Mar 8 13:12 demo.sh
-rw-r--r--. 1 brownsarahm spring2022-csc392 77426528 Jan 16 2018 dmel-all-r6.19.gtf
-rw-r--r--. 1 brownsarahm spring2022-csc392 721242 Jan 25 2016 dmel_unique_protein_isoforms_fb_2016_01.tsv
-rw-r--r--. 1 brownsarahm spring2022-csc392 25056938 Jan 25 2016 gene_association.fb
-rw-r--r--. 1 brownsarahm spring2022-csc392 447 Mar 8 13:07 linecounts.txt
**drwxr-xr-x. 2 brownsarahm spring2022-csc392 10 Mar 8 13:16 results**
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625262 Jan 25 2016 SRR307023_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625262 Jan 25 2016 SRR307023_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625376 Jan 25 2016 SRR307024_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625376 Jan 25 2016 SRR307024_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625286 Jan 25 2016 SRR307025_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625286 Jan 25 2016 SRR307025_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625302 Jan 25 2016 SRR307026_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625302 Jan 25 2016 SRR307026_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625312 Jan 25 2016 SRR307027_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625312 Jan 25 2016 SRR307027_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625338 Jan 25 2016 SRR307028_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625338 Jan 25 2016 SRR307028_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625390 Jan 25 2016 SRR307029_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625390 Jan 25 2016 SRR307029_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625318 Jan 25 2016 SRR307030_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625318 Jan 25 2016 SRR307030_2.fastq
We can see in the bold line, that the first character is a d.
The next nine characters indicate permission to Read, Write, and eXecute a file. With either the letter or a -
for permissions not granted, they appear in three groups of three, three characters each for owner, group, anyone with access.
If we want to run the file, we can instead use bash
directly, but this is limited relative to calling our script in other ways.
[brownsarahm@seawulf ~]$ bash demo.sh
script works
Instead, to add execute permission, we can use chmod
[brownsarahm@seawulf ~]$ chmod +x demo.sh
[brownsarahm@seawulf ~]$ ls -l
total 138452
-rw-r--r--. 1 brownsarahm spring2022-csc392 12534006 Apr 18 2021 bash-lesson.tar.gz
-rwxr-xr-x. 1 brownsarahm spring2022-csc392 20 Mar 8 13:12 demo.sh
-rw-r--r--. 1 brownsarahm spring2022-csc392 77426528 Jan 16 2018 dmel-all-r6.19.gtf
-rw-r--r--. 1 brownsarahm spring2022-csc392 721242 Jan 25 2016 dmel_unique_protein_isoforms_fb_2016_01.tsv
-rw-r--r--. 1 brownsarahm spring2022-csc392 25056938 Jan 25 2016 gene_association.fb
-rw-r--r--. 1 brownsarahm spring2022-csc392 447 Mar 8 13:07 linecounts.txt
drwxr-xr-x. 2 brownsarahm spring2022-csc392 10 Mar 8 13:16 results
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625262 Jan 25 2016 SRR307023_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625262 Jan 25 2016 SRR307023_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625376 Jan 25 2016 SRR307024_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625376 Jan 25 2016 SRR307024_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625286 Jan 25 2016 SRR307025_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625286 Jan 25 2016 SRR307025_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625302 Jan 25 2016 SRR307026_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625302 Jan 25 2016 SRR307026_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625312 Jan 25 2016 SRR307027_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625312 Jan 25 2016 SRR307027_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625338 Jan 25 2016 SRR307028_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625338 Jan 25 2016 SRR307028_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625390 Jan 25 2016 SRR307029_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625390 Jan 25 2016 SRR307029_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625318 Jan 25 2016 SRR307030_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625318 Jan 25 2016 SRR307030_2.fastq
[brownsarahm@seawulf ~]$ ./demo.sh
script works
We can add a bit more to our script to make it more interesting
[brownsarahm@seawulf ~]$ nano demo.sh
for VAR in *.gz
do
echo $VAR
done
echo 'script works'
and note that that does not change the permission.
[brownsarahm@seawulf ~]$ ls -l
total 138452
-rw-r--r--. 1 brownsarahm spring2022-csc392 12534006 Apr 18 2021 bash-lesson.tar.gz
-rwxr-xr-x. 1 brownsarahm spring2022-csc392 60 Mar 8 13:27 demo.sh
-rw-r--r--. 1 brownsarahm spring2022-csc392 77426528 Jan 16 2018 dmel-all-r6.19.gtf
-rw-r--r--. 1 brownsarahm spring2022-csc392 721242 Jan 25 2016 dmel_unique_protein_isoforms_fb_2016_01.tsv
-rw-r--r--. 1 brownsarahm spring2022-csc392 25056938 Jan 25 2016 gene_association.fb
-rw-r--r--. 1 brownsarahm spring2022-csc392 447 Mar 8 13:07 linecounts.txt
drwxr-xr-x. 2 brownsarahm spring2022-csc392 10 Mar 8 13:16 results
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625262 Jan 25 2016 SRR307023_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625262 Jan 25 2016 SRR307023_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625376 Jan 25 2016 SRR307024_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625376 Jan 25 2016 SRR307024_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625286 Jan 25 2016 SRR307025_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625286 Jan 25 2016 SRR307025_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625302 Jan 25 2016 SRR307026_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625302 Jan 25 2016 SRR307026_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625312 Jan 25 2016 SRR307027_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625312 Jan 25 2016 SRR307027_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625338 Jan 25 2016 SRR307028_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625338 Jan 25 2016 SRR307028_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625390 Jan 25 2016 SRR307029_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625390 Jan 25 2016 SRR307029_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625318 Jan 25 2016 SRR307030_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392 1625318 Jan 25 2016 SRR307030_2.fastq
14.6. Review today’s class#
Important
This is an integrative 2x badge.
File permissions are represented numerically often in octal, by transforming the permissions for each level (owner, group, all) as binary into a number. Add octal-review.md to your KWL repo and answer the following.
1. Transform the permissions [`r--`, `rw-`, `rwx`] to octal, by treating it as three bits. 1. Transform the permission we changed our script to `rwxr-xr-x` to octal. 1. Which of the following would prevent a file from being edited by other people 777 or 755?
create a
vocab-quiz.md
file with 10 mutliple choice questions that cover topics from at least 5 different class sessions. Each question should have 4 options, 1 correct and 3 that represent a reasonable, but incorrect idea someone may have. Questions should check that a person understand the key terms of the first half of the course. For each option explain why it is/not correct in a way that would help clarify someone’s confusion if they had picked that answer instead of the correct answer. Use the following syntax:
Question text
- [ ] a wrong answer
- [ ] another wrong
- [x] correct answer marked with x
- [ ] another wong
---
- explanation for first wrong
- explanation for second wrong
- key point about correct
- explantion for third wrong
---
Next question
14.7. Prepare for Next Class#
Review your notes on IDE use,make sure they are complete (from 2023-02-23 Prepare)
Preview the Stack Overflow Developer Survey Technology section parts that are about tools.
14.8. Practice#
Important
This is an integrative 2x badge.
File permissions are represented numerically often in octal, by transforming the permissions for each level (owner, group, all) as binary into a number. Add octal-practice.md to your KWL repo and answer the following.
1. Transform the permissions [`r--`, `rw-`, `rwx`] to octal, by treating it as three bits. 1. Transform the permission we changed our script to `rwxr-xr-x` to octal. 1. What permissions would we want (both long and in octal) would allow only the owner to run a file? 1. Which of the following would prevent a file from being edited by other people 777 or 755?
create a
midterm.md
file in your kwl repo with 10 mutliple choice questions that cover topics from at least 5 different class sessions. Each question should have 4 options, 1 correct and 3 that represent a reasonable, but incorrect idea someone may have. All 10 questions should check understanding of key concepts, not only terminology or the name of a command. For each option explain why it is/not correct in a way that would help clarify someone’s confusion if they had picked that answer instead of the correct answer. Use the following syntax:
Question text
- [ ] a wrong answer
- [ ] another wrong
- [x] correct answer marked with x
- [ ] another wong
---
- explanation for first wrong
- explanation for second wrong
- key point about correct
- explantion for third wrong
---
Next question
14.9. Experience Report Evidence#
No specific files.
14.10. Questions After Today’s Class#
14.10.1. What is meant by the term “gene symbol” ?#
For the purpose of this class it is just a bit of the content in the file. For completeness, since the file is genomic data is is a name for the gene at that line.
14.10.2. How would we change permissions for just one type of user with chmod instead of all users?#
There are two ways. One is to se the permision you want it to end up at, for exame chmod 744 demo.sh
. Another is to use more of the modifiers:
“g” is for group
“o” is for others
“-” is for removing permissions
“r” is for read-permission
“w” is for write-permission
“x” is for execute permission.
“u” is for user / owner
“+” is for adding permissions
So, instead of chmod +x demo.sh
you could do chmod u+x demo.sh
, or since we already did chmod +x demo.sh
you could remove that access from the ones we do not want with chmod go-x
.
you can see all of this and examples on the man page
14.10.3. When would I use these remote servers most often?#
Remote servers are used for production code, and for large computations in any sort of science setting or machine learning setting.
14.10.4. Is it normal to be able to change file permissions? In the case in class i think it worked because it was our own file, but we wouldnt be able to just change other peoples permissions right?#
Correct you have to have write permission to be able to do this.
14.10.5. What type of data structure for storage does the server we used today follow? How does it version track and store files?#
It is a regular unix system with a standard file system. If we want to track versions, we use git. (or another version control system)
14.10.6. Will the files on our seawulf, still be available after exiting.#
Yes, but not forever. Seawulf is a teaching server that makes no promise of backing up or keeping your data forever.
14.10.7. What are .fastq and .gz files?#
.gz is a zip file, .fastq is a plain text output of a sequencing tool.
14.10.8. Can we connect to other servers other than the URI server?#
Yes, if you have credentials, the ssh works basically the same way. We will learn one more thing that will give you more ability to use other servers next week.
14.10.9. What is the point of a remote server?#
More compute power.
14.10.10. Do the fiels download directly to my computer or to the remote server?#
The file goes to the system where the command is run, in this case, it went to the the server.
14.10.11. where is the ssh server downloaded on my computer?#
ssh is the connection protocol or the set of rules by which our information is sent.
The server is another computer, you were using that computer through your local terminal.
We did not create any files on your local system today.
14.10.12. Is access to the chmod command restricted on computers and servers?#
on a file by file basis typically, eys.
14.10.13. Can I access the remote files using vscode and write code?#
Theoretically yes, but a more typical workflow would be to edit large files locally running vscode on your system then send the code to the remote server to run it. You might send one file at a time directly there or push from your local system to for example GitHub and then from GitHub on the server.
14.10.14. Will I work with files more like this or locally as a software engineer?#
Maybe not this type specifically, but you will almost surely encounter a large file at some point. You will also likely encounter a server for some reason.
14.10.15. for files that are impossible to open with text editors because they’re too long, is it possible to edit parts of the file just through terminal commands?#
Yes! This is a good explore badge as well.
14.10.16. How can I zip from tthe terminal?#
zip file file
see the man
14.10.17. Are you able to run a website through a terminal?#
You could launch a web server from a terminal. You can also launch a browser.
For he course website when I build the pdf, version a bash script launches a browser, uses the browsers print to pdf function and closes the browser.