I/O and file management
◤
Absolute path vs relative path
Address: No. 70, Lianhai Road, Gushan District, Kaohsiung City
Two houses away from me
I live here
Absolute path
Relative path
◤
Where can we see the file path?
In the VS Code terminal
◤
Very basic Linux commands
cd <path> – change to the folder at <path>
.. – the parent folder (cd .. moves up one level)
. – the current folder
◤
Path system for Linux/Mac
home
silasysh
ANNOgesic
folder1
goatools
Absolute path
Relative path
An absolute path reaches the target folder from anywhere on the computer.
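The difference can be sketched in Python with the standard pathlib module. The folder names come from the diagram above, but their nesting and the starting folder are assumptions made for illustration.

```python
import posixpath
from pathlib import PurePosixPath

# Absolute path: starts at the root "/", so it is valid from any location.
absolute = PurePosixPath("/home/silasysh/ANNOgesic/folder1")

# Relative path: only meaningful from a starting folder.
# Assumption: we are currently in /home/silasysh/goatools.
current = PurePosixPath("/home/silasysh/goatools")
relative = PurePosixPath("../ANNOgesic/folder1")

print(absolute.is_absolute())  # True
print(relative.is_absolute())  # False

# Joining the current folder with the relative path and applying ".."
# (the parent folder) leads to the same target as the absolute path.
resolved = posixpath.normpath(str(current / relative))
print(resolved)  # /home/silasysh/ANNOgesic/folder1
```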
◤
Path system for Windows
C:
shyu
ANNOgesic
folder1
goatools
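A small sketch of how such a Windows path might be written in Python. The nesting of the folders is assumed from the diagram, and PureWindowsPath is used so the example behaves the same on any operating system.

```python
from pathlib import PureWindowsPath

# In a normal Python string a single backslash starts an escape sequence
# (for example "\n" is a newline), so each backslash is written twice.
doubled = "C:\\shyu\\ANNOgesic\\folder1"

# A raw string (prefix r) keeps the backslashes exactly as typed.
raw = r"C:\shyu\ANNOgesic\folder1"

print(doubled == raw)  # True

path = PureWindowsPath(raw)
print(path.is_absolute())  # True
print(path.drive)          # C:
print(path.name)           # folder1
print(path.parent.name)    # ANNOgesic
```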
◤
Exercise
C:
Paul
folder2
folder1
folder3
folder4
folder5
folder6
Users
◤
Open file
◤
read()
1. On Windows, write \\ in the path, because a single \ is a special symbol (the escape character) in Python strings.
2. Don't forget the filename extension, such as .txt.
6 characters including a space.
File handle
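A minimal sketch of read(), assuming a small text file. The file name and its content are hypothetical and are created first so the example runs anywhere.

```python
import os
import tempfile

# Hypothetical example file, created here so the sketch is self-contained.
filename = os.path.join(tempfile.mkdtemp(), "example.txt")
with open(filename, "w") as out:
    out.write("Hello world\nSecond line\n")

fh = open(filename)      # fh is the file handle
first_six = fh.read(6)   # read 6 characters, including the space
rest = fh.read()         # read everything that is left
fh.close()

print(repr(first_six))   # 'Hello '
print(repr(rest))        # 'world\nSecond line\n'
```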
◤
readline() and readlines()
Each line read from the file ends with a "\n".
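The two methods can be compared on a small file. This is only a sketch; the file name and the gene names in it are made up.

```python
import os
import tempfile

# Hypothetical example file with three lines.
filename = os.path.join(tempfile.mkdtemp(), "genes.txt")
with open(filename, "w") as out:
    out.write("gene1\ngene2\ngene3\n")

fh = open(filename)
first = fh.readline()       # one line as a string, still ending with "\n"
remaining = fh.readlines()  # all remaining lines as a list of strings
fh.close()

print(repr(first))  # 'gene1\n'
print(remaining)    # ['gene2\n', 'gene3\n']
```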
◤
strip
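A short sketch of what strip() removes, using a made-up line. Without arguments it removes whitespace (spaces, tabs, newlines) from both ends only; rstrip("\n") removes just the trailing newline.

```python
# Hypothetical line as it might come from readline().
line = "  geneA\t120 \n"

stripped = line.strip()            # whitespace removed from both ends
newline_only = line.rstrip("\n")   # only the trailing "\n" removed

print(repr(stripped))      # 'geneA\t120'
print(repr(newline_only))  # '  geneA\t120 '
```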
◤
readlines with a for loop
Remove the “\n”
Do something for each line.
Normally each line is a record for one gene, protein, or sample.
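One way this loop might look, assuming a tab-separated file with a gene name and a length on each line. The file layout and the values are hypothetical.

```python
import os
import tempfile

# Hypothetical tab-separated file: gene name, then gene length.
filename = os.path.join(tempfile.mkdtemp(), "gene_lengths.txt")
with open(filename, "w") as out:
    out.write("geneA\t120\ngeneB\t300\n")

lengths = {}
fh = open(filename)
for line in fh.readlines():
    line = line.strip()               # remove the "\n"
    name, length = line.split("\t")   # do something for each line
    lengths[name] = int(length)
fh.close()

print(lengths)  # {'geneA': 120, 'geneB': 300}
```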
◤
close
The default mode is "r" (read).
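A small sketch showing the default mode and the effect of close(). The file is a hypothetical one created only for this example.

```python
import os
import tempfile

# Hypothetical example file.
filename = os.path.join(tempfile.mkdtemp(), "example.txt")
with open(filename, "w") as out:
    out.write("gene1\n")

fh = open(filename)        # no mode given, so the file is opened with "r"
mode = fh.mode
closed_before = fh.closed
fh.close()
closed_after = fh.closed

print(mode)           # r
print(closed_before)  # False
print(closed_after)   # True

# Reading from a closed file raises ValueError.
try:
    fh.read()
    read_failed = False
except ValueError:
    read_failed = True
print(read_failed)    # True
```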
◤
Exercise
The header starts with >
Accession number
Strain name
◤
◤
seek
is not executed
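A sketch of why a second loop over the same file handle does nothing, and how seek(0) fixes it. The file and its content are hypothetical.

```python
import os
import tempfile

# Hypothetical example file.
filename = os.path.join(tempfile.mkdtemp(), "genes.txt")
with open(filename, "w") as out:
    out.write("gene1\ngene2\n")

fh = open(filename)

first_pass = []
for line in fh:
    first_pass.append(line.strip())

second_pass = []
for line in fh:   # the position is already at the end: body is not executed
    second_pass.append(line.strip())

fh.seek(0)        # move the position back to the start of the file
third_pass = []
for line in fh:
    third_pass.append(line.strip())

fh.close()

print(first_pass)   # ['gene1', 'gene2']
print(second_pass)  # []
print(third_pass)   # ['gene1', 'gene2']
```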
◤
write
write() does not add "\n" automatically.
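A minimal sketch of this behavior, using a hypothetical output file: consecutive write() calls are joined directly unless a "\n" is written explicitly.

```python
import os
import tempfile

# Hypothetical output file.
filename = os.path.join(tempfile.mkdtemp(), "output.txt")

fh = open(filename, "w")
fh.write("gene1")
fh.write("gene2")     # lands on the same line as "gene1"
fh.write("\n")        # the line break has to be written explicitly
fh.write("gene3\n")
fh.close()

fh = open(filename)
content = fh.read()
fh.close()

print(repr(content))  # 'gene1gene2\ngene3\n'
```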
◤
Run the same script with different strings, mode = "w": the file is overwritten.
Run the same script with different strings, mode = "a": the new string is appended.
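The two modes can be compared with a small sketch. The helper run_script is a hypothetical stand-in for one run of a script that writes a string to the same file.

```python
import os
import tempfile

# Hypothetical output file.
filename = os.path.join(tempfile.mkdtemp(), "output.txt")

def run_script(text, mode):
    """Stand-in for one run of a script that writes text to the file."""
    fh = open(filename, mode)
    fh.write(text + "\n")
    fh.close()

def read_all():
    fh = open(filename)
    content = fh.read()
    fh.close()
    return content

run_script("first run", "w")
run_script("second run", "w")   # "w" overwrites the earlier content
after_w = read_all()

run_script("third run", "a")    # "a" appends to the existing content
after_a = read_all()

print(repr(after_w))  # 'second run\n'
print(repr(after_a))  # 'second run\nthird run\n'
```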
◤
writelines
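A short sketch of writelines() with a made-up list of gene names. Like write(), it does not add "\n", so the line breaks are attached to each item first.

```python
import os
import tempfile

# Hypothetical output file and data.
filename = os.path.join(tempfile.mkdtemp(), "output.txt")
genes = ["gene1", "gene2", "gene3"]

fh = open(filename, "w")
# writelines() writes every string in the list, one after another,
# without adding line breaks, so "\n" is appended to each item here.
fh.writelines([gene + "\n" for gene in genes])
fh.close()

fh = open(filename)
content = fh.read()
fh.close()

print(repr(content))  # 'gene1\ngene2\ngene3\n'
```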
◤
◤
Exercise
◤
◤
csv
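A minimal sketch of writing and reading a table with Python's standard csv module. The file name, column names, and values are hypothetical. Opening the file with newline="" is what the csv documentation recommends.

```python
import csv
import os
import tempfile

# Hypothetical table: a header row followed by two records.
filename = os.path.join(tempfile.mkdtemp(), "genes.csv")
rows = [["gene", "length"], ["geneA", "120"], ["geneB", "300"]]

fh = open(filename, "w", newline="")
writer = csv.writer(fh)
writer.writerows(rows)        # one list per line, items separated by commas
fh.close()

fh = open(filename, newline="")
read_back = []
for row in csv.reader(fh):    # each row comes back as a list of strings
    read_back.append(row)
fh.close()

print(read_back[0])        # ['gene', 'length']
print(read_back == rows)   # True
```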
◤
with open
No f.close() is needed.
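A small sketch of the with statement on a hypothetical file. The file is closed automatically when the indented block ends.

```python
import os
import tempfile

# Hypothetical example file.
filename = os.path.join(tempfile.mkdtemp(), "example.txt")

with open(filename, "w") as f:
    f.write("gene1\n")
# Leaving the block closes the file; no f.close() is written.
closed_after_write = f.closed

with open(filename) as f:
    lines = [line.strip() for line in f]
closed_after_read = f.closed

print(lines)               # ['gene1']
print(closed_after_write)  # True
print(closed_after_read)   # True
```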
◤
Exercise
◤
◤
sys.argv
Stop the script immediately
The name of script
The input information
Besides the script name, the script still needs one more input argument.
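A sketch of how these pieces might fit together. The function get_input and the script name are hypothetical; a list is passed in place of sys.argv so the example can run without a command line.

```python
import sys

def get_input(argv):
    """Return the first input after the script name, or stop the script."""
    # argv[0] is the name of the script; argv[1:] holds the input information.
    if len(argv) < 2:
        print("Usage: python " + argv[0] + " INPUT")
        sys.exit()  # stop the script immediately
    return argv[1]

# Simulated command line; a real script would pass sys.argv instead.
message = get_input(["my_script.py", "hello"])
print(message)  # hello
```

Calling get_input(["my_script.py"]) prints the usage line and stops, because only the script name is present.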
◤
argparse
◤
Use args.XXX to access an input value. XXX is taken from the full (long) name of the argument.
◤
When the script is run without inputs, it reports which parameters can be assigned and which ones are required.
If required is set to True, this parameter must be given when running the script.
If a parameter is not given, its default setting is used.
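A minimal argparse sketch with one required parameter and one with a default. The option names and the booking theme are hypothetical; parse_args is given a list here in place of the real command line.

```python
import argparse

parser = argparse.ArgumentParser(description="Hypothetical booking script")
parser.add_argument("-n", "--name", required=True,
                    help="guest name (must be given)")
parser.add_argument("-d", "--days", type=int, default=1,
                    help="number of nights (default: 1)")

# Equivalent to running: python script.py --name Paul
args = parser.parse_args(["--name", "Paul"])

print(args.name)  # Paul
print(args.days)  # 1  (default, because --days was not given)
```

If --name is left out, argparse prints a usage message naming the required argument and stops the script.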
◤
No input value is needed. Using the flag switches the value from False to True.
action="store_true" means that when -sr is used, args.single_room becomes True; otherwise it keeps the default, False. "store_false" is the opposite of "store_true".
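A sketch of a flag defined with action="store_true". The option strings -sr and --single_room are assumed from the text above, and parse_args is given a list in place of the real command line.

```python
import argparse

parser = argparse.ArgumentParser()
# Assumed option strings; the long name determines args.single_room.
parser.add_argument("-sr", "--single_room", action="store_true",
                    help="book a single room")

with_flag = parser.parse_args(["-sr"])
without_flag = parser.parse_args([])

print(with_flag.single_room)     # True
print(without_flag.single_room)  # False
```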
◤
Items are separated by spaces.
Using nargs="+" means this argument collects one or more values into a list.
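A short sketch of nargs="+". The option name --guests and the values are hypothetical.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("-g", "--guests", nargs="+",
                    help="one or more guest names, separated by spaces")

# Equivalent to running: python script.py --guests Amy Bob Carl
args = parser.parse_args(["--guests", "Amy", "Bob", "Carl"])

print(args.guests)       # ['Amy', 'Bob', 'Carl']
print(len(args.guests))  # 3
```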
◤
-h or --help prints the help information.
◤
Exercise
◤
◤
Exercise
the feature’s name, such as gene, CDS, tRNA…
◤
Define argument
The main function has three major steps: read_fasta, read_gff, and write_seq.
In addition, gene_seqs is used to store the output information.
◤
Read FASTA: this is almost the same as the previous exercise, except that it returns two outputs (ac for the accession number and seq for the whole sequence).
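One possible shape for read_fasta, as a sketch only. It assumes a file with a single record and a header whose first word after ">" is the accession number; the example file and its accession are made up.

```python
import os
import tempfile

def read_fasta(fasta_file):
    """Return (ac, seq) for a FASTA file that holds a single record."""
    ac = None
    seq_parts = []
    with open(fasta_file) as fh:
        for line in fh:
            line = line.strip()
            if line.startswith(">"):
                # Assumption: the accession number is the first word
                # of the header.
                ac = line[1:].split()[0]
            elif line:
                seq_parts.append(line)   # sequence lines are joined later
    return ac, "".join(seq_parts)

# Hypothetical FASTA file with the sequence split over two lines.
filename = os.path.join(tempfile.mkdtemp(), "example.fa")
with open(filename, "w") as out:
    out.write(">AC0001 hypothetical strain\nAGAACT\nTCTTGA\n")

ac, seq = read_fasta(filename)
print(ac)   # AC0001
print(seq)  # AGAACTTCTTGA
```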
◤
5' AGAACT 3'
3' TCTTGA 5'
◤
For complement
For reverse
Write the file in FASTA format
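The complement, reverse, and writing steps might look like the sketch below. Storing gene_seqs as a dictionary from name to sequence is an assumption, the gene name is made up, and only upper-case A, T, G, and C are handled.

```python
import os
import tempfile

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    complement = "".join(COMPLEMENT[base] for base in seq)  # for complement
    return complement[::-1]                                 # for reverse

def write_seq(out_file, gene_seqs):
    """Write each name and sequence in FASTA format."""
    with open(out_file, "w") as out:
        for name, seq in gene_seqs.items():
            out.write(">" + name + "\n")
            out.write(seq + "\n")

# AGAACT -> complement TCTTGA -> reversed AGTTCT
rc = reverse_complement("AGAACT")
print(rc)  # AGTTCT

# Assumption: gene_seqs maps a (hypothetical) gene name to its sequence.
gene_seqs = {"geneA": rc}
out_file = os.path.join(tempfile.mkdtemp(), "genes.fa")
write_seq(out_file, gene_seqs)

with open(out_file) as fh:
    written = fh.read()
print(repr(written))  # '>geneA\nAGTTCT\n'
```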
◤
◤