We congratulate you on choosing Hopkins and look forward to working with
you. This letter, from the Biostatistics Information Technology
Committee (BIT), is to help you get acquainted with the computing
environment in the Department of Biostatistics, and get you started
on the software that you will need. Many of your computing questions
not directly addressed by this letter will be answered by the BIT web
site: http://www.biostat.jhsph.edu/bit/.
Training
Completing a Biostatistics degree requires well-developed computing skills. The Department of Biostatistics offers many opportunities for students of all skill levels and backgrounds to improve their computing skills. Beyond implementing the recommendations in this letter and spending some time playing around with the statistics-related programs on your machine, you have three main department-sanctioned opportunities to learn more about computing. We encourage you to check out as many of these as you think you need!
Students within the department offer a series of lunchtime
computing sessions for incoming and current students in any Biostatistics degree program. The web site for the "computing club" is at http://www.biostat.jhsph.edu/bit/compintro/.
It is highly recommended that you attend computing club in your first
year, as many of the sessions are geared towards helping familiarize
you with the programs and the computing environment that you will be
using over the next few years.
The Biostatistics faculty offer a course, 140.776, on some of the core programming tools necessary for completing a degree in Biostatistics.
The Bioinformatics program
also offers several courses in computing topics such as Perl and
database management.
In addition, the Johns Hopkins Homewood campus offers many other courses which may be of interest. See the course offerings at www.jhu.edu for more details.
Getting started
We require all incoming students to have a personal laptop. Please
contact Cindy Hockett (chockett@jhsph.edu)
if you require assistance in obtaining a laptop.
Choose a laptop/operating system that you are comfortable with. There
are students and faculty who use all of the mainstream operating
systems like Mac, Windows and Linux. Any modern laptop that can handle
document editing will suffice. However, students buying budget laptops
will likely need to do most of their numerical computing on the cluster
(see below).
The school has several wireless networks that are active throughout the building. Therefore, if your laptop does not already come with a wireless network card, make sure that you get one. Any computer with a wireless card can connect to the "guest" network, which is not quite as fast and has some security limitations. To get onto the official wireless network of the school, you will need to visit the School's Information Systems (IS) Department, W3014, and turn in this form: http://www.jhsph.edu/IS/Forms/Account_Request_Student.pdf.
Departmental computing resources
Student offices have network jacks and cables, so you will have fast connections to our departmental servers. As mentioned, nearly all of the building has wireless coverage, so you can work in the many common areas, such as the coffee shop on the second floor. Two labs, the Biostatistics Library and the Genome Cafe, offer additional work areas and a few high-end standalone computers for memory-intensive interactive work.
Our department, jointly with two other departments, hosts
some of the best distributed computing resources around. There will be lectures (and possibly a computing club session or two) on the use of our high-performance computing cluster. Each cluster node has two 64-bit processors and the memory
configuration is designed to accommodate the varying needs of
biostatistics research. A login machine, called Enigma, is
used for accessing the cluster. Enigma is accessible from outside of the school. We strongly recommend that students save their
important research files on Enigma, which is backed up daily.
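As a rough illustration of what logging in to Enigma and copying a research file there might look like from a laptop with a command-line OpenSSH client, the sketch below only builds and prints the command lines. The host name `enigma.example.edu`, the user name `jdoe`, and the file paths are placeholders invented for this example; this letter does not give the real values, so check with the BIT Committee before using anything like it.

```python
import shlex

# Hypothetical values: this letter does not state Enigma's host name
# or the format of your account name.
ENIGMA_HOST = "enigma.example.edu"
USER = "jdoe"


def login_command(user: str = USER, host: str = ENIGMA_HOST) -> list[str]:
    """Return the argument list for an interactive OpenSSH login."""
    return ["ssh", f"{user}@{host}"]


def backup_command(local_path: str, remote_dir: str,
                   user: str = USER, host: str = ENIGMA_HOST) -> list[str]:
    """Return the argument list for copying one local file to the login machine with scp."""
    return ["scp", local_path, f"{user}@{host}:{remote_dir}"]


if __name__ == "__main__":
    # Print the commands instead of running them, because the host above is a placeholder.
    print(shlex.join(login_command()))
    print(shlex.join(backup_command("thesis/analysis.R", "research/")))
```

Once the real host name and account are known, the same argument lists could be passed to `subprocess.run` or simply typed at a terminal.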
School-wide computing resources
In addition to departmental resources, the School of Public Health offers services such as the wireless network, webmail and the my.jhsph portal. The IS department offers students informational seminars on the school-wide resources. In addition, IS will help students install anti-virus software on their computers. Visit the IS web site at http://www.jhsph.edu/IS/index.html for more information.
Software
The software below is required for our students to get up and running in our environment. Everything except the X server (for PCs) can be downloaded free of charge. You should attempt to download and install this software before arriving at Hopkins.
For Macintosh OS X Users
The most important software to get is R, which can be downloaded at cran.r-project.org. This program is a statistical computing language that most of our faculty and students use. It is both free of cost and open source. After installing R, open it and click on “Help” then “Manuals” then “Introduction to R” to see the PDF manual. R changes approximately every six months and so you should periodically update it.
Fernando, the chair of the BIT committee, has kept a log of his OS X installation notes at http://www.pinedalab.jhsph.edu/Bit/StupidMacTricks.
Many students and faculty use the document preparation and presentation programs in Microsoft's Office suite. As an alternative, consider OpenOffice (for the Mac there is also NeoOffice), which is freely available for all of the popular operating systems, or even Google Docs. Spreadsheet programs, such as Microsoft's Excel or Gnumeric, are generally insufficient for the advanced needs of real statisticians, and so these programs are less useful.
For Microsoft Windows Users
The most important software to get is R, which can be downloaded at cran.r-project.org. This program is a statistical computing language that all of our faculty and students use. It is both free of cost and open source. For Windows users, at the R web site, click on: “R Binaries”, then “Windows”, then “base”, then “RVersion.exe”, then follow the instructions. After installing R, open it and click on “Help” then “Manuals” then “Introduction to R” to see the PDF manual. R changes approximately every six months and so you should periodically update it.
You will need some software to edit programs. You should NEVER edit
programs in Microsoft Word or Notepad. Some choices for
editors are Emacs (www.gnu.org/software/emacs/windows/ntemacs.html),
WinEdt or Notepad++. Note that Emacs has a steep learning curve, but it is what many of the students and faculty use.
You will need a secure shell client program, which will allow
you to connect to our servers. PuTTY is a free client and is available at http://www.chiark.greenend.org.uk/~sgtatham/putty/ . WinSCP is a file transfer program that you will need to get files to and from
Enigma: http://winscp.net/eng/index.php.
You will need a copy of LaTeX, a typesetting program. The most popular version for Microsoft Windows is MiKTeX, available at www.miktex.org. Using LaTeX is a little difficult at first, so it is covered in the BIT committee's lunchtime seminars.
You will need an X server. An X server is a program that allows you to
pass graphics from remote computers to your screen over secure
shell. Xming is a good free X server for Windows: http://www.straightrunning.com/XmingNotes/.
You should download some software to read PDF files. The most popular
software for this purpose is Adobe's Acrobat Reader, which can be
found at Adobe's web site. In addition, Foxit Reader (http://www.foxitsoftware.com/pdf/rd_intro.php) has many of Acrobat's features and offers a free trial version. Sumatra (http://blog.kowalczyk.info/software/sumatrapdf/) is a very fast reader (that only does reading). Also, it is useful to be able to print to PDF files. PDFCreator (http://sourceforge.net/projects/pdfcreator/) is a free PDF printer for Windows.
Many students and faculty use the document preparation and presentation programs in Microsoft's Office suite. As an alternative, consider OpenOffice (www.openoffice.org), which is available for all of the popular operating systems, or even Google Docs. Spreadsheet programs, such as Microsoft's Excel or Gnumeric, are generally insufficient for the advanced needs of real statisticians, and so these programs are less useful.
Asking questions
If you have any difficulties, please email the BIT Committee at bitsupport@jhsph.edu. For questions about how to use specific applications such as R or Emacs, you can email bithelp@jhsph.edu, which is read by both faculty and students in the department.
In addition, we want to emphasize that our system administrators, the faculty, the BIT committee and your fellow students are all available to help you get adjusted. Don't be afraid to ask any of us a question!
Sincerely,
The Biostatistics Information Technology Committee