<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Blue Cog Blog &#187; bash</title>
	<atom:link href="http://www.bluecog.com/blog/tag/bash/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.bluecog.com/blog</link>
	<description>It's just a freaking blue cog...</description>
	<lastBuildDate>Tue, 03 Aug 2010 19:32:59 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Tweaking the Bash Prompt</title>
		<link>http://www.bluecog.com/blog/2010/07/03/tweaking-the-bash-prompt/</link>
		<comments>http://www.bluecog.com/blog/2010/07/03/tweaking-the-bash-prompt/#comments</comments>
		<pubDate>Sat, 03 Jul 2010 13:24:30 +0000</pubDate>
		<dc:creator>Bill Melvin</dc:creator>
				<category><![CDATA[Computer User]]></category>
		<category><![CDATA[Software Development]]></category>
		<category><![CDATA[bash]]></category>
		<category><![CDATA[Git]]></category>
		<category><![CDATA[Ubuntu]]></category>

		<guid isPermaLink="false">http://www.bluecog.com/blog/?p=636</guid>
		<description><![CDATA[A little Saturday morning tweaking. Based on this post at railstips.org, I decided to adjust my Bash prompt by appending the following to my ~/.bashrc file: #... function parse_git_branch { ref=$(git symbolic-ref HEAD 2&#62; /dev/null) &#124;&#124; return echo &#34;(&#34;${ref#refs/heads/}&#34;)&#34; } BLACK=&#34;\[\033[0;30m\]&#34; BLUE=&#34;\[\033[0;34m\]&#34; VIOLET=&#34;\[\033[1;35m\]&#34; CYAN=&#34;\[\033[0;36m\]&#34; PS1=&#34;\n[$CYAN\u@\h:$BLUE\w$VIOLET \$(parse_git_branch)$BLACK]\n\$ &#34; The prompt will now show the name of [...]]]></description>
			<content:encoded><![CDATA[<p>A little Saturday morning tweaking.</p>
<p>Based on <a href="http://railstips.org/blog/archives/2009/02/02/bedazzle-your-bash-prompt-with-git-info/">this</a> post at railstips.org, I decided to adjust my Bash prompt by appending the following to my <strong>~/.bashrc</strong> file:</p>
<pre class="brush: bash">
#...

function parse_git_branch {
  ref=$(git symbolic-ref HEAD 2&gt; /dev/null) || return
  echo &quot;(${ref#refs/heads/})&quot;
}

BLACK=&quot;\[\033[0;30m\]&quot;
BLUE=&quot;\[\033[0;34m\]&quot;
VIOLET=&quot;\[\033[1;35m\]&quot;
CYAN=&quot;\[\033[0;36m\]&quot;

PS1=&quot;\n[$CYAN\u@\h:$BLUE\w$VIOLET \$(parse_git_branch)$BLACK]\n\$ &quot;
</pre>
<p>The prompt will now show the name of the branch I am working in when the current directory is part of a Git repository. The original code used yellow, red, and green to highlight parts of the prompt. That messed with my mind when I ran RSpec and saw yellow and red when I was expecting all green. Rather than get used to it, I changed the colors. I also added some newlines to perhaps keep the command line neater when deep in a directory tree.</p>
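<p>For readers who find Python easier to follow than Bash parameter expansion, here is a minimal sketch of what <code>${ref#refs/heads/}</code> does to the output of <code>git symbolic-ref HEAD</code>. The name <strong>branch_label</strong> is made up for this illustration and is not part of the .bashrc above.</p>
<pre class="brush: python">
# Sketch only: imitates the prefix stripping in parse_git_branch.
# branch_label is a hypothetical helper name.

PREFIX = "refs/heads/"

def branch_label(ref):
    """Return the branch name wrapped in parentheses, or an empty string."""
    if not ref:
        # git symbolic-ref printed nothing (for example, outside a repository).
        return ""
    if ref.startswith(PREFIX):
        # Same effect as ${ref#refs/heads/}: drop the leading prefix once.
        ref = ref[len(PREFIX):]
    return "(" + ref + ")"

print(branch_label("refs/heads/master"))  # prints (master)
</pre>
<p>The Bash function returns early when <code>git symbolic-ref</code> fails, which is why the prompt shows nothing outside a repository. The empty-string case in the sketch stands in for that.</p>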
<p style="text-align: center;"><a href="http://www.bluecog.com/blog/wp-content/uploads/2010/07/terminal_20100703_0909.png"><img class="size-medium wp-image-642 aligncenter" style="border: 1px solid black;" title="Terminal" src="http://www.bluecog.com/blog/wp-content/uploads/2010/07/terminal_20100703_0909-300x139.png" alt="Terminal screen shot" width="300" height="139" /></a></p>
<p><em>[Update 2010-07-23]</em><br />
After running with the above settings for a while I decided I don&#8217;t care for the colors in the prompt. Don&#8217;t need the square brackets either. I do like seeing the current git branch. That simplifies things a bit.</p>
<pre class="brush: bash">
#...

function parse_git_branch {
  ref=$(git symbolic-ref HEAD 2&gt; /dev/null) || return
  echo &quot;(${ref#refs/heads/})&quot;
}

PS1=&quot;\n\u@\h:\w  \$(parse_git_branch)\n\$ &quot;
</pre>
]]></content:encoded>
			<wfw:commentRss>http://www.bluecog.com/blog/2010/07/03/tweaking-the-bash-prompt/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Pair Networks Database Backup Automation</title>
		<link>http://www.bluecog.com/blog/2009/11/10/pair-networks-database-backup/</link>
		<comments>http://www.bluecog.com/blog/2009/11/10/pair-networks-database-backup/#comments</comments>
		<pubDate>Tue, 10 Nov 2009 22:00:47 +0000</pubDate>
		<dc:creator>Bill Melvin</dc:creator>
				<category><![CDATA[Software Development]]></category>
		<category><![CDATA[Automation]]></category>
		<category><![CDATA[Backup]]></category>
		<category><![CDATA[bash]]></category>
		<category><![CDATA[FTP]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[sh]]></category>
		<category><![CDATA[Ubuntu]]></category>

		<guid isPermaLink="false">http://www.bluecog.com/blog/?p=403</guid>
		<description><![CDATA[I have a couple WordPress blogs, this being one of them, hosted at Pair Networks. I also have another non-blog site that uses a MySQL database. I have been doing backups of the databases manually through Pair&#8217;s Account Control Center (ACC) web interface on a somewhat regular basis, but it was bugging me that I [...]]]></description>
			<content:encoded><![CDATA[<p>I have a couple WordPress blogs, this being one of them, hosted at <a href="http://www.pair.com/" target="_blank">Pair Networks</a>. I also have another non-blog site that uses a MySQL database. I have been doing backups of the databases manually through Pair&#8217;s Account Control Center (ACC) web interface on a somewhat regular basis, but it was bugging me that I hadn&#8217;t automated it. I finally got around to doing so.</p>
<p>A search led to this <a href="http://www.bradtrupp.com/mysql-backup-cron.html" target="_blank">blog post</a> by Brad Trupp. He describes how to set up an automated database backup on a Pair Networks host. I used &#8220;technique 2&#8221; from his post as the basis for the script I wrote.</p>
<h3>Automating the Backup on the Pair Networks Host</h3>
<p>First I connected to my assigned server at Pair Networks using SSH (I use <a href="http://www.chiark.greenend.org.uk/~sgtatham/putty/" target="_blank">PuTTY</a> for that). There was already a directory named <strong>backup</strong> in my home directory where the backups done through the ACC were written. I decided to use that directory for the scripted backups as well.</p>
<p>In my home directory I created a shell script named <strong>dbbak.sh</strong>.</p>
<p><code>touch dbbak.sh</code></p>
<p>The script should have permissions set to make it private (it will contain database passwords) and executable.</p>
<p><code>chmod 700 dbbak.sh</code></p>
<p>I used the nano editor to write the script.</p>
<p><code>nano -w dbbak.sh</code></p>
<p>The script stores the current date and time (formatted as YYYYmmdd_HHMM) in a variable and then runs the mysqldump utility that creates the database backups. The resulting backup files are simply SQL text that will recreate the objects in a MySQL database and insert the data. The shell script I use backs up three different MySQL databases so the following example shows the same.</p>
<pre class="brush: bash">
#!/bin/sh

dt=`/bin/date +%Y%m%d_%H%M`

/usr/local/bin/mysqldump -hDBHOST1 -uDBUSERNAME1 -pDBPASSWORD1 USERNAME_DBNAME1 &gt; /usr/home/USERNAME/backup/dbbak_${dt}_DBNAME1.sql

/usr/local/bin/mysqldump -hDBHOST2 -uDBUSERNAME2 -pDBPASSWORD2 USERNAME_DBNAME2 &gt; /usr/home/USERNAME/backup/dbbak_${dt}_DBNAME2.sql

/usr/local/bin/mysqldump -hDBHOST3 -uDBUSERNAME3 -pDBPASSWORD3 USERNAME_DBNAME3 &gt; /usr/home/USERNAME/backup/dbbak_${dt}_DBNAME3.sql
</pre>
<p>Substitute these tags in the above example with your database and account details:</p>
<ul>
<li><strong>DBHOST</strong><em>n</em> is the database server, such as db24.pair.com.</li>
<li><strong>DBUSERNAME</strong><em>n</em> is the full access username for the database.</li>
<li><strong>DBPASSWORD</strong><em>n</em> is the password for that database user.</li>
<li><strong>USERNAME_DBNAME</strong><em>n</em> is the full database name that has the account user name as the prefix. </li>
<li><strong>USERNAME</strong> is the Pair Networks account user name.</li>
<li><strong>DBNAME</strong><em>n</em> is the database name without the account user name prefix.</li>
</ul>
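<p>The <code>${dt}</code> timestamp fixes the layout of every backup file name, which matters when a script later needs to pick the date back out of the name. A small Python sketch of that naming scheme, assuming a made-up database name of <code>blog</code> (<strong>backup_filename</strong> is a hypothetical helper, not part of dbbak.sh):</p>
<pre class="brush: python">
from datetime import datetime

def backup_filename(dbname, when):
    """Build a name shaped like the ones dbbak.sh writes."""
    # Same format string the shell script passes to /bin/date.
    stamp = when.strftime("%Y%m%d_%H%M")
    return "dbbak_" + stamp + "_" + dbname + ".sql"

# "blog" is a placeholder database name for this example.
name = backup_filename("blog", datetime(2009, 11, 10, 4, 1))
print(name)        # dbbak_20091110_0401_blog.sql
print(name[6:14])  # 20091110 (the date starts at index 6)
</pre>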
<p>Once the script was written and tested manually on the host, I used the ACC (Advanced Features / Manage Cron jobs) to set up a cron job to run the script daily at 4:01 AM.</p>
<h3>Automating Retrieval of the Backup Files</h3>
<p>It was nice having the backups running daily without any further work on my part but, if I wanted a local copy of the backups, I still had to download them manually. Though <a href="http://filezilla-project.org/" target="_blank">FileZilla</a> is easy to use, downloading files via FTP seemed like a prime candidate for automation as well. I turned to Python for that. Actually I turned to an excellent book that has been on my shelf for a few years now, <a href="http://www.amazon.com/gp/product/1590593715?ie=UTF8&#038;tag=bluecog-20&#038;linkCode=as2&#038;camp=1789&#038;creative=9325&#038;creativeASIN=1590593715">Foundations of Python Network Programming</a> by John Goerzen. Using the <strong>ftplib</strong> examples in the book as a foundation, I created a Python script named <strong>getdbbak.py</strong> to download the backup files automatically.</p>
<pre class="brush: python">
#!/usr/bin/env python
# getdbbak.py

from ftplib import FTP
from datetime import datetime
from DeleteList import GetDeleteList
import os, sys
import getdbbak_email

logfilename = &#039;getdbbak-log.txt&#039;
msglist = []

def writelog(msg):
    scriptdir = os.path.dirname(sys.argv[0])
    filename = os.path.join(scriptdir, logfilename)
    logfile = open(filename, &#039;a&#039;)
    logfile.write(&quot;%s\n&quot; % msg)
    logfile.close()

def say(what):
    print(what)
    msglist.append(what)
    writelog(what)

def retrieve_db_backups():
    host = sys.argv[1]
    username = sys.argv[2]
    password = sys.argv[3]
    local_backup_dir = sys.argv[4]

    say(&quot;START %s&quot; % datetime.now().strftime(&#039;%Y-%m-%d %H:%M&#039;))
    say(&quot;Connect to %s as %s&quot; % (host, username))

    f = FTP(host)
    f.login(username, password)

    ls = f.nlst(&quot;dbbak_*.sql&quot;)
    ls.sort()
    say(&quot;items = %d&quot; % len(ls))
    for filename in ls:
        local_filename = os.path.join(local_backup_dir, filename)
        if os.path.exists(local_filename):
            say(&quot;(skip) %s&quot; % local_filename)
        else:
            say(&quot;(RETR) %s&quot; % local_filename)
            local_file = open(local_filename, &#039;wb&#039;)
            f.retrbinary(&quot;RETR %s&quot; % filename, local_file.write)
            local_file.close()

    date_pos = 6
    keep_days = 5
    keep_weeks = 6
    keep_months = 4
    del_list = GetDeleteList(ls, date_pos, keep_days, keep_weeks, keep_months)
    if len(del_list) &gt; 0:
        if len(ls) - len(del_list) &gt;= keep_days:
            for del_filename in del_list:
                say(&quot;DELETE %s&quot; % del_filename)
                f.delete(del_filename)
        else:
            say(&quot;WARNING: GetDeleteList failed sanity check. No files deleted.&quot;)

    f.quit()
    say(&quot;FINISH %s&quot; % datetime.now().strftime(&#039;%Y-%m-%d %H:%M&#039;))
    getdbbak_email.SendLogMessage(msglist)

if len(sys.argv) == 5:
    retrieve_db_backups()
else:
    print(&#039;USAGE: getdbbak.py Host User Password LocalBackupDirectory&#039;)
</pre>
<p>This script runs via cron on a PC running Ubuntu 8.04 LTS that I use as a local file/subversion/trac server. The script does a bit more than just download the files. It deletes older files from the host based on rules for number of days, weeks, and months to keep. It also writes some messages to a log file and sends an email with the current session&#8217;s log entries.</p>
<p>To set up the cron job in Ubuntu I opened a terminal and ran the following command to edit the crontab file:</p>
<p><code>crontab -e</code></p>
<p>The crontab file specifies commands to run automatically at scheduled times. I added an entry to the crontab file that runs a script named <strong>getdbbak.sh</strong> at 6 AM every day. Here is the crontab file:</p>
<pre class="brush: bash">
MAILTO=&quot;&quot; 

# m h dom mon dow command 

0 6 * * * /home/bill/GetDbBak/getdbbak.sh
</pre>
<p>The first line prevents cron from sending an email listing the output of any commands cron runs. The getdbbak.py script will send its own email so I don&#8217;t need one from cron. I can always enable the cron email later if I want to see that output to debug a failure in a script cron runs.</p>
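<p>The comment line in the crontab names the five schedule fields. As a rough illustration only (cron does its own parsing, and <strong>split_entry</strong> is a hypothetical helper), the entry above breaks down like this in Python:</p>
<pre class="brush: python">
FIELDS = ["minute", "hour", "day of month", "month", "day of week"]

def split_entry(entry):
    """Split a simple crontab line into its schedule fields and command."""
    # The first five whitespace-separated items are the schedule;
    # everything after them is the command.
    parts = entry.split(None, 5)
    return dict(zip(FIELDS, parts[:5])), parts[5]

schedule, command = split_entry("0 6 * * * /home/bill/GetDbBak/getdbbak.sh")
print(schedule["hour"], schedule["minute"])  # 6 0
print(command)
</pre>
<p>An asterisk means &#8220;every&#8221;, so this entry runs at minute 0 of hour 6 on every day of every month.</p>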
<p>Here is the getdbbak.sh shell script that is executed by cron:</p>
<pre class="brush: bash">
#!/bin/bash 

/home/bill/GetDbBak/getdbbak.py FTP.EXAMPLE.COM USERNAME PASSWORD /mnt/data2/files/Backup/PairNetworksDb
</pre>
<p>This shell script runs the getdbbak.py Python script and passes the FTP login credentials and the destination directory for the backup files as command line arguments. </p>
<p>As I mentioned, the getdbbak.py script deletes older files from the host based on rules. The call to <strong>GetDeleteList</strong> returns a list of files to delete from the host. That function is implemented in a separate module, <strong>DeleteList.py</strong>:</p>
<pre class="brush: python">
#!/usr/bin/env python
# DeleteList.py

from datetime import datetime
import KeepDateList

def GetDateFromFileName(filename, datePos):
    &quot;&quot;&quot;Expects filename to contain a date in the format YYYYMMDD starting
       at position datePos.
    &quot;&quot;&quot;
    try:
        yr = int(filename[datePos : datePos + 4])
        mo = int(filename[datePos + 4 : datePos + 6])
        dy = int(filename[datePos + 6 : datePos + 8])
        dt = datetime(yr, mo, dy)
        return dt
    except ValueError:
        # No valid date at that position; the caller skips this file.
        return None

def GetDeleteList(fileList, datePos, keepDays, keepWeeks, keepMonths):
    dates = []
    for filename in fileList:
        dt = GetDateFromFileName(filename, datePos)
        if dt is not None:
            dates.append(dt)
    keep_dates = KeepDateList.GetDatesToKeep(dates, keepDays, keepWeeks, keepMonths)
    del_list = []
    for filename in fileList:
        dt = GetDateFromFileName(filename, datePos)
        if (dt is not None) and (dt not in keep_dates):
            del_list.append(filename)
    return del_list
</pre>
<p>That module in turn uses the function <strong>GetDatesToKeep</strong> defined in the module <strong>KeepDateList.py</strong> to decide which files to keep in order to maintain the desired days, weeks, and months of backup history. If a file&#8217;s name contains a date that&#8217;s not in the list of dates to keep then it goes in the list of files to delete.</p>
<pre class="brush: python">
#!/usr/bin/env python
# KeepDateList.py

from datetime import datetime

def ListHasOnlyDates(listOfDates):
    dt_type = type(datetime(2009, 11, 10))
    for item in listOfDates:
        if type(item) != dt_type:
            return False
    return True

def GetUniqueSortedDateList(listOfDates):
    if len(listOfDates) &lt; 2:
        return listOfDates
    listOfDates.sort()
    result = [listOfDates[0]]
    last_date = listOfDates[0].date()
    for i in range(1, len(listOfDates)):
        if listOfDates[i].date() != last_date:
            last_date = listOfDates[i].date()
            result.append(listOfDates[i])
    return result

def GetDatesToKeep(listOfDates, daysToKeep, weeksToKeep, monthsToKeep):
    if daysToKeep &lt; 1:
        raise ValueError(&quot;daysToKeep must be greater than zero.&quot;)
    if weeksToKeep &lt; 0:
        raise ValueError(&quot;weeksToKeep must not be less than zero.&quot;)
    if monthsToKeep &lt; 0:
        raise ValueError(&quot;monthsToKeep must not be less than zero.&quot;)

    if not ListHasOnlyDates(listOfDates):
        raise TypeError(&quot;List must only contain items of type &#039;datetime.datetime&#039;.&quot;)

    dates = GetUniqueSortedDateList(listOfDates)
    if not dates:
        # No dated files at all, so there is nothing to keep.
        return []

    tail = len(dates) - 1
    keep = [dates[tail]]
    days_left = daysToKeep - 1
    while (days_left &gt; 0) and (tail &gt; 0):
        tail -= 1
        days_left -= 1
        keep.append(dates[tail])

    year, week_number, weekday = dates[tail].isocalendar()
    weeks_left = weeksToKeep
    while (weeks_left &gt; 0) and (tail &gt; 0):
        tail -= 1
        yr, wn, wd = dates[tail].isocalendar()
        if (wn != week_number) or (yr != year):
            weeks_left -= 1
            year, week_number, weekday = dates[tail].isocalendar()
            keep.append(dates[tail])

    month = dates[tail].month
    year = dates[tail].year
    months_left = monthsToKeep
    while (months_left &gt; 0) and (tail &gt; 0):
        tail -= 1
        if (dates[tail].month != month) or (dates[tail].year != year):
            months_left -= 1
            month = dates[tail].month
            year = dates[tail].year
            keep.append(dates[tail])

    return keep
</pre>
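<p>To make the retention rule concrete, here is a compact Python 3 restatement of the same days, then weeks, then months walk, run on a handful of made-up dates. It is a sketch, not a replacement for KeepDateList.py: it uses <code>date</code> objects instead of <code>datetime</code>, skips the argument checks, and the name <strong>dates_to_keep</strong> is hypothetical.</p>
<pre class="brush: python">
from datetime import date

def dates_to_keep(dates, keep_days, keep_weeks, keep_months):
    """Walk backward from the newest date, keeping days, weeks, then months."""
    dates = sorted(set(dates))
    if not dates:
        return []
    i = len(dates) - 1
    keep = [dates[i]]
    # Days: keep the newest keep_days distinct dates.
    days_left = keep_days - 1
    while days_left > 0 and i > 0:
        i -= 1
        days_left -= 1
        keep.append(dates[i])
    # Weeks: keep the newest date of each earlier ISO week.
    week = tuple(dates[i].isocalendar())[:2]
    weeks_left = keep_weeks
    while weeks_left > 0 and i > 0:
        i -= 1
        this_week = tuple(dates[i].isocalendar())[:2]
        if this_week != week:
            weeks_left -= 1
            week = this_week
            keep.append(dates[i])
    # Months: keep the newest date of each earlier month.
    month = (dates[i].year, dates[i].month)
    months_left = keep_months
    while months_left > 0 and i > 0:
        i -= 1
        this_month = (dates[i].year, dates[i].month)
        if this_month != month:
            months_left -= 1
            month = this_month
            keep.append(dates[i])
    return keep

# Made-up backup dates; 2009-11-09 was a Monday, so it starts a new ISO week.
sample = [
    date(2009, 10, 15), date(2009, 10, 31), date(2009, 11, 7),
    date(2009, 11, 8), date(2009, 11, 9), date(2009, 11, 10),
]
kept = dates_to_keep(sample, keep_days=2, keep_weeks=1, keep_months=1)
print([d.isoformat() for d in kept])
# ['2009-11-10', '2009-11-09', '2009-11-08', '2009-10-31']
</pre>
<p>With 2 days, 1 week, and 1 month to keep, the files dated 2009-11-07 and 2009-10-15 would land in the delete list.</p>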
<p>I also put the function <strong>SendLogMessage</strong> that sends the session log via email in a separate module, <strong>getdbbak_email.py</strong>:</p>
<pre class="brush: python">
#!/usr/bin/env python
# getdbbak_email.py

# The lowercase module names work on Python 2.5+ and Python 3.
from email.mime.text import MIMEText
from email import utils
import smtplib

def SendLogMessage(msgList):
    from_addr = &#039;atest@bogusoft.com&#039;
    to_addr = &#039;wm.melvin@gmail.com&#039;
    smtp_server = &#039;localhost&#039;

    # Join the session&#039;s log lines into a single message body.
    message = &quot;&quot;
    for s in msgList:
        message += s + &quot;\n&quot;

    msg = MIMEText(message)
    msg[&#039;To&#039;] = to_addr
    msg[&#039;From&#039;] = from_addr
    msg[&#039;Subject&#039;] = &#039;Download results&#039;
    msg[&#039;Date&#039;] = utils.formatdate(localtime = 1)
    msg[&#039;Message-ID&#039;] = utils.make_msgid()

    smtp = smtplib.SMTP(smtp_server)
    smtp.sendmail(from_addr, to_addr, msg.as_string())
    # Close the SMTP session cleanly.
    smtp.quit()
</pre>
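<p>On Python 3 the same kind of message can be built with <code>email.message.EmailMessage</code> from the standard library. The sketch below only builds the message and does not send it; <strong>build_log_message</strong> and the example.com addresses are placeholders, not part of the original scripts.</p>
<pre class="brush: python">
from email.message import EmailMessage
from email.utils import formatdate, make_msgid

def build_log_message(lines, from_addr, to_addr):
    """Build (but do not send) an email holding one log line per row."""
    msg = EmailMessage()
    msg["To"] = to_addr
    msg["From"] = from_addr
    msg["Subject"] = "Download results"
    msg["Date"] = formatdate(localtime=True)
    msg["Message-ID"] = make_msgid()
    msg.set_content("\n".join(lines) + "\n")
    return msg

# Placeholder addresses and log lines for illustration.
msg = build_log_message(["START", "FINISH"], "from@example.com", "to@example.com")
print(msg["Subject"])
</pre>
<p>A message built this way could then be handed to <code>smtplib.SMTP.send_message</code>, which reads the addresses from the headers.</p>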
<p>Here is a ZIP file containing the set of Python scripts, including some unit tests (such as they are) for the file deletion logic: <a href="http://www.bogusoft.com/files/public/GetDbBak.zip">GetDbBak.zip</a></p>
<p>I hope this may be useful to others with a similar desire to automate MySQL database backups and FTP transfers who haven&#8217;t come up with their own solution yet. Even if you don&#8217;t use Pair Networks as your hosting provider some of the techniques may still apply. I&#8217;m still learning too so if you find mistakes or come up with improvements to this solution, please let me know.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.bluecog.com/blog/2009/11/10/pair-networks-database-backup/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How I Split Podcast Files</title>
		<link>http://www.bluecog.com/blog/2009/04/24/how-i-split-podcast-files/</link>
		<comments>http://www.bluecog.com/blog/2009/04/24/how-i-split-podcast-files/#comments</comments>
		<pubDate>Fri, 24 Apr 2009 16:01:12 +0000</pubDate>
		<dc:creator>Bill Melvin</dc:creator>
				<category><![CDATA[Works for me]]></category>
		<category><![CDATA[bash]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[mp3]]></category>
		<category><![CDATA[Ubuntu]]></category>

		<guid isPermaLink="false">http://www.bluecog.com/blog/?p=194</guid>
		<description><![CDATA[Note: This is a &#8220;How-I&#8221; (works for me) not a &#8220;How-to&#8221; (do as I say) post. I do goofy stuff sometimes. For example, I use Linux to download a couple podcasts targeted to Microsoft Windows developers. Specifically, I use µTorrent (that&#8217;s the &#8220;Micro&#8221; symbol so the name is pronounced &#8220;MicroTorrent&#8221;), a Windows BitTorrent client, running [...]]]></description>
			<content:encoded><![CDATA[<p><em>Note: This is a &#8220;How-I&#8221; (works for me), not a &#8220;How-to&#8221; (do as I say) post.</em></p>
<p>I do goofy stuff sometimes. For example, I use Linux to download a couple podcasts targeted to Microsoft Windows developers. Specifically, I use <a href="http://www.utorrent.com/">µTorrent</a> (that&#8217;s the &#8220;Micro&#8221; symbol so the name is pronounced &#8220;MicroTorrent&#8221;), a Windows <a href="http://en.wikipedia.org/wiki/BitTorrent_(protocol)">BitTorrent</a> client, running in <a href="http://www.winehq.org/">Wine</a> on Ubuntu to download the <a href="http://www.dotnetrocks.com/">.NetRocks</a> and <a href="http://www.hanselminutes.com/">Hanselminutes</a> podcasts. I&#8217;ve had no problems running µTorrent in Wine. I got started doing this because my mp3 player was awkward to work with in Windows XP.</p>
<p>When I connected my Sansa m250 mp3 player to a Windows XP box, the driver software XP loaded wanted me to interact with the mp3 player as a media device. It has been a while, and I can&#8217;t recall exactly what it did, but I do recall it wanted me to use a media library application (one that would probably try to enforce <a href="http://en.wikipedia.org/wiki/Digital_rights_management">DRM</a> restrictions) and did not give me direct access to the file system on the player. There is probably a way around that, but I didn&#8217;t find it quickly at the time. What I did find was that when I connected the mp3 player to my old PC running Ubuntu it detected it and mounted it as a file system device that I could happily copy mp3 files to as I pleased. Good enough for me.</p>
<p>At first I was using the <a href="http://azureus.sourceforge.net/">Azureus</a> BitTorrent client, which is a Java app and runs on Ubuntu, to download the podcasts (and an occasional <a href="http://distrowatch.com/">distro</a> to play with). That application seemed to get more bloated with each release. It started displaying a bunch of flashy stuff and promoting things that you probably shouldn&#8217;t be downloading (but it&#8217;s okay if you <a href="http://www.thelocal.se/18954/20090419/">don&#8217;t believe in copyright</a>). I read about µTorrent and tried it on a Windows XP PC. It&#8217;s a lightweight program that does BitTorrent well without promoting piracy (personally, I do think copyright, <a href="http://wiki.lessig.org/index.php/Against_perpetual_copyright">with limits</a>, is a good thing). While this worked well for downloading, I didn&#8217;t like the extra step of copying files from the PC running Windows to the other running Ubuntu to load them onto my mp3 player. After reading a timely article about Wine (the source of the article escapes me now), I decided to try running µTorrent using Wine. I don&#8217;t recall having any problems <a href="https://help.ubuntu.com/community/Wine">setting it up</a>, it just worked. I did have to fiddle with my router to set up <a href="http://wiki.vuze.com/index.php/Port_forwarding">port forwarding</a> but that&#8217;s not related to Wine or Ubuntu, just something you may have to do for BitTorrent to work.</p>
<p>This method of downloading the podcasts works well, but that&#8217;s not the end of the story. Occasionally I would be part way through a podcast and, for some reason (maybe I was trying to rewind a little bit within the file but my finger slipped and it went back to the beginning of the file), I would have to fast-forward to where I left off. Hour-long podcasts in a single mp3 file are not easy to fast forward with the Sansa player I have. It doesn&#8217;t <em>forward faster</em> the longer you hold the button like some devices do, it just goes at the same (painfully slow for a large file) pace. It seemed like splitting the mp3 files into sections would make that sort of thing easier. Bet there&#8217;s an app for that.</p>
<p>A search of the Ubuntu application repository turned up <a href="http://mp3splt.sourceforge.net/">mp3splt</a>. It has a GUI but I only wanted the command line executable which is available in the repository and can be installed from the command line (note that there&#8217;s no &#8220;i&#8221; in <em>mp3splt</em>):</p>
<p><code>sudo apt-get install mp3splt</code></p>
<p>After a couple trips to the <a href="http://mp3splt.sourceforge.net/mp3splt_page/documentation/man.html">man page</a> to sort out which command line arguments to use, I had it splitting big mp3 files into sections, each in its own smaller mp3 file. That worked for splitting the files but I found that the player didn&#8217;t put those files in order when playing back. That’s not acceptable. I probably could just make a playlist file and use that to get the sections to play in order. I wondered if setting the ID3 tags in a way that numbered the tracks would make the player play them in order. Turns out it would. A search for &#8220;ID3&#8221; in the Ubuntu repository led to <a href="http://packages.ubuntu.com/hardy/id3tool">id3tool</a>, a simple utility for editing the ID3 tags in mp3 files. I installed it too:</p>
<p><code>sudo apt-get install id3tool</code></p>
<p>I wrote a shell script named <strong>podsplit.sh</strong> to put this splitting apart all together. I use a specific directory to hold the mp3 files I want to split (but I’ll call it a “folder” since that’s the GUI metaphor, and I use the GNOME GUI to move the files around). I manually copy the downloaded mp3 files into the <strong>2Split</strong> folder and then open a terminal and run the script. The script creates a sub-folder for each mp3 file that is split. When the script is finished I copy the sub-folders containing the resulting smaller mp3 files to the Sansa mp3 player. </p>
<p>Here&#8217;s the shell script:</p>
<pre class="brush: bash">
#!/bin/bash

#------------------------------------------------------------
# podsplit.sh
#
# by Bill Melvin (bogusoft.com)
#
# BASH script for splitting mp3 podcasts into smaller pieces.
# I want to do this because it takes &quot;forever&quot; to fast-
# forward or rewind in a huge mp3 on my Sansa player.
#
# This script requires mp3splt and id3tool.
#
# This script, being a personal-use one-off utility, also
# assumes some things:
# 1. mp3 files to be split are placed in ~/2Split
# 2. The file names are in the format showname_0001.mp3
#    or showname_0001_morestuff.mp3 where 0001 is the
#    episode number.
#
# I&#039;m no nix wiz and I don&#039;t write many shell scripts so
# this script also echoes a bunch of stuff so I can see
# what&#039;s going on.
#
#------------------------------------------------------------
# [2009-01-18] First version.
#
# [2009-01-24] Use abbreviated show name for Artist.
#
# [2009-02-12] Changed split time from 3.0 to 5.0.
#
# [2009-02-16] Use track number instead of end-time in track
# title.
#
# [2009-02-19] Redirect some output to log file.
#------------------------------------------------------------

split_home=~/2Split
logfn=&quot;${split_home}/podsplit-log.txt&quot;

ChangeID3() {
  filepath=$1
  filename=$2

  # Get track number from ID3.
  temp=`id3tool &quot;$filepath&quot; | grep Track: | cut -c9-`

  # Zero-pad to length of 3 characters.
  track=`printf &quot;%03d&quot; $temp`

  # Extract the name of the show and the episode number from
  # the file name. This only works if the file naming follows
  # the convention showname_0001_morestuff.mp3 where 0001
  # is the episode number. The file name is split into fields
  # delimited by the underscore character.
  show=`echo &quot;$filename&quot; | cut -d&#039;_&#039; -f1`
  episode=`echo &quot;$filename&quot; | cut -d&#039;_&#039; -f2`
  abbr=&quot;${show:0:6}&quot;
  album=&quot;${abbr}_${episode}&quot;
  title=&quot;${abbr}_${episode}_${track}&quot;

  echo &quot;ChangeID3&quot;
  echo &quot;filepath = $filepath&quot; &gt;&gt; $logfn
  echo &quot;filename = $filename&quot; &gt;&gt; $logfn
  echo &quot;show = $show&quot; &gt;&gt; $logfn
  echo &quot;abbr = $abbr&quot; &gt;&gt; $logfn
  echo &quot;episode = $episode&quot; &gt;&gt; $logfn
  echo &quot;album = $album&quot; &gt;&gt; $logfn
  echo &quot;title = $title&quot; &gt;&gt; $logfn
  echo &quot;track = $track&quot; &gt;&gt; $logfn
  echo &quot;BEFORE&quot; &gt;&gt; $logfn
  id3tool &quot;$filepath&quot; &gt;&gt; $logfn

  id3tool --set-album=&quot;$album&quot; --set-artist=&quot;$abbr&quot; --set-title=&quot;$title&quot; &quot;$1&quot;

  echo &quot;AFTER&quot; &gt;&gt; $logfn
  id3tool &quot;$filepath&quot; &gt;&gt; $logfn
}

SplitMP3() {
  echo &quot;SplitMP3&quot;
  name1=$1
  echo &quot;name1 = $name1&quot;

  # Get file name and extension without directory path.
  name2=${name1#$split_home/}
  echo &quot;name2 = $name2&quot;

  # Get just the file name without the extension.
  name3=${name2%.mp3}
  echo &quot;name3 = $name3&quot;

  outdir=$split_home/$name3.split
  echo &quot;Create $outdir&quot;
  mkdir &quot;$outdir&quot;

  # Quote the paths so file names containing spaces survive.
  mp3splt -a -t 5.0 -d &quot;$outdir&quot; -o @t_@n &quot;$1&quot;

  for MP3 in &quot;$outdir&quot;/*.mp3
  do
    ChangeID3 &quot;$MP3&quot; &quot;$name3&quot;
  done
}

for FN in &quot;$split_home&quot;/*.mp3
do
  SplitMP3 &quot;$FN&quot;
done

echo &quot;Done.&quot;
</pre>
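<p>The naming logic in <strong>ChangeID3</strong> is easier to see with a concrete value. Here is a small Python sketch of the same field splitting and zero-padding. The file name <code>hanselminutes_0158</code> and the function name <strong>id3_fields</strong> are invented for the example, and it assumes the name follows the showname_0001 convention.</p>
<pre class="brush: python">
def id3_fields(basename, track):
    """Derive artist, album, and title the way ChangeID3 does."""
    parts = basename.split("_")         # like cut -d'_'
    show, episode = parts[0], parts[1]
    abbr = show[:6]                     # like ${show:0:6}
    album = abbr + "_" + episode
    title = "%s_%03d" % (album, track)  # like printf "%03d"
    return {"artist": abbr, "album": album, "title": title}

# Hypothetical file name (without the .mp3 extension) and track number.
fields = id3_fields("hanselminutes_0158", 3)
print(fields["title"])  # hansel_0158_003
</pre>
<p>Because the track number is zero-padded inside the title, the titles sort in playing order even when sorted as plain text.</p>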
<p>This is not a flexible script as my folder for splitting files is hard-coded and it assumes a file naming convention for the mp3 files being split. If you’re an experienced shell scripter I’m sure you can do better. I still consider myself a Linux &#8220;noob&#8221; (and offer proof as well), intermediate in some areas at best. I am posting this because someone else may be trying to solve a similar problem and this can serve as an example of what worked for one person, in one situation, to work around the limitations of one particular mp3 player. Someone less goofy would probably just buy an iPod and use iTunes to handle the podcast files.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.bluecog.com/blog/2009/04/24/how-i-split-podcast-files/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
