Thursday, January 8, 2015

I need help sorting out the files

I was going to write a post about what I just learned from Scott about file quality (this would duplicate the info in his post but make sure I understood it), and then pull out a few reads from the problem sample he noticed and BLAST them to see if they align to anything.

But I've discovered that first I need to get the various files sorted out and properly named.  I have the fastq files from the sequencing centre on a little USB backup drive plugged into my laptop, but these files have the original confusing names, not the new informative names.

I also have a nice table made by Josh that shows both names for each sample, so in principle I could just carefully rename my files.
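
Before touching any files, a rename of this kind could be checked with a dry run. The sketch below is only an illustration, not something Josh or Scott wrote. It assumes the table can be exported as tab-separated text with the old name in the first column and the new name in the second, and that each file name begins with the old sample name followed by `_` or `.`. Both assumptions are hypothetical, as are the function names `load_name_table` and `plan_renames`. It builds a list of proposed renames and raises an error on anything ambiguous, without renaming anything.

```python
import csv


def load_name_table(handle):
    """Read tab-separated (old_name, new_name) rows; this column layout is assumed."""
    mapping = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2 or not row[0].strip():
            continue  # skip blank or incomplete rows
        old, new = row[0].strip(), row[1].strip()
        if old in mapping:
            raise ValueError(f"old name listed twice: {old}")
        mapping[old] = new
    if len(set(mapping.values())) != len(mapping):
        raise ValueError("two old names map to the same new name")
    return mapping


def plan_renames(filenames, mapping):
    """Return (old_filename, new_filename) pairs without touching any file."""
    plan = []
    for fname in sorted(filenames):
        # A sample name matches only if the file name is exactly that name or
        # continues with "_" or ".", so Sample_9C3 cannot claim Sample_9C30 files.
        matches = [
            old for old in mapping
            if fname.startswith(old)
            and fname[len(old):len(old) + 1] in ("", "_", ".")
        ]
        if len(matches) != 1:
            raise ValueError(f"{fname}: expected 1 matching sample, found {len(matches)}")
        old = matches[0]
        plan.append((fname, mapping[old] + fname[len(old):]))
    new_names = [new for _, new in plan]
    if len(set(new_names)) != len(new_names):
        raise ValueError("plan would produce duplicate file names")
    return plan
```

With a hypothetical file called `Sample_9C30_R1.fastq`, the plan would pair it with `antx_M2_C_R1.fastq`. The plan can be read over by eye before anything is actually renamed.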

But there are some issues:
  1. There are 72 files and that's a lot of opportunities for error.  If Scott has the renamed files on the Zoology cluster, he could put a copy of them into my directory there, and then I could download them onto my backup drive.  That would be inefficient but safe.
  2. Should I really be working with files on the Zoology cluster, rather than local files?
  3. I want to get the 24 RNAseq files from last year too.

While waiting for advice I think I'll go ahead and use Josh's table to identify the original name of the problem sample (new name antx_M2_C, old name Sample_9C30), and pull out the reads and do those BLAST searches.
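
Pulling out a few reads for BLAST mostly means converting FASTQ records to FASTA. Here is a minimal sketch, assuming the files use the usual four-line FASTQ layout (header, sequence, `+` line, qualities) with no wrapped sequence lines. The function name `fastq_to_fasta` is made up for illustration.

```python
from itertools import islice


def fastq_to_fasta(lines, n_reads):
    """Convert the first n_reads four-line FASTQ records to FASTA text."""
    out = []
    records = iter(lines)
    for _ in range(n_reads):
        record = list(islice(records, 4))
        if len(record) < 4:
            break  # ran out of complete records
        header, seq, plus, _qual = (s.rstrip("\n") for s in record)
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"not a four-line FASTQ record: {header!r}")
        # FASTA keeps the read name and sequence and drops the qualities.
        out.append(f">{header[1:]}\n{seq}\n")
    return "".join(out)
```

Passing an open file handle for the Sample_9C30 reads as `lines` would give FASTA text for the first few reads, which can be pasted into the BLAST web form. If the files are gzip-compressed, they would need to be opened with `gzip.open(path, "rt")` instead of `open`.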

2 comments:

Scott M. said...

After I'm done re-aligning the old files, I'll have all of them renamed and on an external hard drive. You can either transfer them that way or we can figure out a way to transfer the files over the zoology server. I don't recommend renaming the files yourself.

Rosie Redfield said...

Once you have them on your hard drive we can quickly copy them to mine (20 min). I agree that this will be a lot safer than me renaming my files.

p.s. Love the reCaptcha!