Apr 17 2011
 

With the introduction of .Net and a new, modern framework library, developers understandably were very cheerful. A shiny new ecosystem with a mint library, designed without any backwards-compatibility curses or legacy support. Finally, a library to take us into the future without second-guessing. Well, those were the hopes and dreams of the often too-optimistic and naive.

However, if you’d grant me those simplistic titles, you’d understand my extreme disappointment when the compiler barfed on my AddRange call on HttpWebRequest with a long parameter. Apparently HttpWebRequest lacks a 64-bit AddRange member.

Surely this was a small mistake, a relic from the .Net 1.0 era. Nope, it’s in 2.0 as well. Right then, I should be using 3.5. What’s wrong with me, using 2.0, such an outdated version? Wrong again, I am using 3.5. But I need to resume downloads on 7GB+ files!

To me, this is truly a shocking goof. After all, .Net is supposed to be all about the agility and modernity that is the web. Three major releases of the framework and no one put a high-priority tag on this missing member? Surely my panic was exaggerated. It must be. There is certainly some simple workaround that everyone is using that makes this issue really a low-priority one.

Right then, the HttpWebRequest is a WebRequest, and I really don’t need a specialized function to set an HTTP header. Let’s set the header directly:

            HttpWebRequest request = WebRequest.Create(Uri) as HttpWebRequest;

            request.Headers["Range"] = "bytes=0-100";

To which, .Net responded with the following System.ArgumentException:

This header must be modified using the appropriate property.

Frustration! Luckily, somebody ultimately took notice of this glaring omission and added the AddRange(long, long) overload to .Net 4.0.
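For readers who can target .Net 4.0, a minimal sketch of the 64-bit overloads in use might look like the following. The URL and the class name are placeholders of mine, not part of the original code, and the request is only constructed, never sent.

```csharp
using System;
using System.Net;

internal static class AddRangeExample
{
    private static void Main()
    {
        // Placeholder URL; nothing goes over the wire until GetResponse() is called.
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://example.com/large-file.iso");

        // An offset well past the 2 GB limit of the older int-only overloads.
        long resumeFrom = 7L * 1024 * 1024 * 1024;

        // .Net 4.0 and later: ask for everything from resumeFrom to the end of the file.
        request.AddRange(resumeFrom);

        // Inspect the Range header that the framework generated.
        Console.WriteLine(request.Headers["Range"]);
    }
}
```

The two-argument form, AddRange(long, long), works the same way when an inclusive end offset is needed.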

So where does this leave us? Seems that I either have to move to .Net 4.0, write my own HttpWebRequest replacement or avoid large files altogether. Unless, that is, I find a hack.

Different solutions do exist to this problem on the web, but the most elegant one was this:

        /// <summary>
        /// Sets an inclusive range of bytes to download.
        /// </summary>
        /// <param name="request">The request to set the range to.</param>
        /// <param name="from">The first byte offset, negative to omit.</param>
        /// <param name="to">The last byte offset, less than <paramref name="from"/> to omit.</param>
        private static void SetWebRequestRange(HttpWebRequest request, long from, long to)
        {
            string first = from >= 0 ? from.ToString() : string.Empty;
            string last = to >= from ? to.ToString() : string.Empty;

            string val = string.Format("bytes={0}-{1}", first, last);

            Type type = typeof(WebHeaderCollection);
            MethodInfo method = type.GetMethod("AddWithoutValidate", BindingFlags.Instance | BindingFlags.NonPublic);
            method.Invoke(request.Headers, new object[] { "Range", val });
        }

Since there were apparently copied pages with similar solutions, I’m a bit hesitant to give credit to any particular page or author in fear of giving credit to a plagiarizer. In return, I’ve improved the technique and put it into a flexible function. In addition, I’ve wrapped WebResponse into a reusable Stream class that plays better with non-network streams. In particular, my WebStream supports reading the Length and Position members and returns the correct results. Here is the full source code:

// --------------------------------------------------------------------------------------
// <copyright file="WebStream.cs" company="Ashod Nakashian">
// Copyright (c) 2011, Ashod Nakashian
// All rights reserved.
// 
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
// 
// o Redistributions of source code must retain the above copyright notice, 
// this list of conditions and the following disclaimer.
// o Redistributions in binary form must reproduce the above copyright notice, 
// this list of conditions and the following disclaimer in the documentation and/or
// other materials provided with the distribution.
// o Neither the name of the author nor the names of its contributors may be used to endorse
// or promote products derived from this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
// OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT 
// SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, 
// INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
// LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// </copyright>
// <summary>
//   Wraps HttpWebRequest and WebResponse instances as Streams.
// </summary>
// --------------------------------------------------------------------------------------

namespace Web
{
    using System;
    using System.IO;
    using System.Net;
    using System.Reflection;

    /// <summary>
    /// HTTP Stream, wraps around HttpWebRequest.
    /// </summary>
    public class WebStream : Stream
	{
		public WebStream(string uri)
            : this(uri, 0)
		{
		}

        public WebStream(string uri, long position)
		{
			Uri = uri;
			position_ = position;
		}

        #region properties

        public string Uri { get; protected set; }
        public string UserAgent { get; set; }
        public string Referer { get; set; }

        #endregion // properties

        #region Overrides of Stream

        /// <summary>
        /// When overridden in a derived class, clears all buffers for this stream and causes any buffered data to be written to the underlying device.
        /// </summary>
        /// <filterpriority>2</filterpriority>
        public override void Flush()
        {
        }

        /// <summary>
        /// When overridden in a derived class, sets the position within the current stream.
        /// </summary>
        /// <returns>
        /// The new position within the current stream.
        /// </returns>
        /// <param name="offset">A byte offset relative to the <paramref name="origin"/> parameter.</param>
        /// <param name="origin">A value of type <see cref="T:System.IO.SeekOrigin"/> indicating the reference point used to obtain the new position.</param>
        /// <filterpriority>1</filterpriority>
        /// <exception cref="NotImplementedException"><c>NotImplementedException</c>.</exception>
        public override long Seek(long offset, SeekOrigin origin)
        {
            throw new NotImplementedException();
        }

        /// <summary>
        /// When overridden in a derived class, sets the length of the current stream.
        /// </summary>
        /// <param name="value">The desired length of the current stream in bytes.</param>
        /// <filterpriority>2</filterpriority>
        /// <exception cref="NotImplementedException"><c>NotImplementedException</c>.</exception>
        public override void SetLength(long value)
        {
            throw new NotImplementedException();
        }

        /// <summary>
        /// When overridden in a derived class, reads a sequence of bytes from the current stream and advances the position within the stream by the number of bytes read.
        /// </summary>
        /// <returns>
        /// The total number of bytes read into the buffer. This can be less than the number of bytes requested if that many bytes are not currently available, or zero (0) if the end of the stream has been reached.
        /// </returns>
        /// <param name="buffer">An array of bytes. When this method returns, the buffer contains the specified byte array with the values between <paramref name="offset"/> and (<paramref name="offset"/> + <paramref name="count"/> - 1) replaced by the bytes read from the current source.</param>
        /// <param name="offset">The zero-based byte offset in <paramref name="buffer"/> at which to begin storing the data read from the current stream.</param>
        /// <param name="count">The maximum number of bytes to be read from the current stream.</param>
        /// <filterpriority>1</filterpriority>
        /// <exception cref="System.ArgumentException">The sum of offset and count is larger than the buffer length.</exception>
        /// <exception cref="System.ArgumentNullException">buffer is null.</exception>
        /// <exception cref="System.ArgumentOutOfRangeException">offset or count is negative.</exception>
        /// <exception cref="System.NotSupportedException">The stream does not support reading.</exception>
        /// <exception cref="System.ObjectDisposedException">Methods were called after the stream was closed.</exception>
		public override int Read(byte[] buffer, int offset, int count)
		{
            if (stream_ == null)
            {
                Connect();
            }

            try
            {
                if (stream_ != null)
                {
                    int read = stream_.Read(buffer, offset, count);
                    position_ += read;
                    return read;
                }
            }
            catch (WebException)
            {
                Close();
            }
            catch (IOException)
            {
                Close();
            }

            return -1;
		}

        /// <summary>
        /// When overridden in a derived class, writes a sequence of bytes to the current stream and advances the current position within this stream by the number of bytes written.
        /// </summary>
        /// <param name="buffer">An array of bytes. This method copies <paramref name="count"/> bytes from <paramref name="buffer"/> to the current stream.</param>
        /// <param name="offset">The zero-based byte offset in <paramref name="buffer"/> at which to begin copying bytes to the current stream.</param>
        /// <param name="count">The number of bytes to be written to the current stream.</param>
        /// <filterpriority>1</filterpriority>
        /// <exception cref="NotImplementedException"><c>NotImplementedException</c>.</exception>
        public override void Write(byte[] buffer, int offset, int count)
        {
            throw new NotImplementedException();
        }

        /// <summary>
        /// When overridden in a derived class, gets a value indicating whether the current stream supports reading.
        /// Always returns true.
        /// </summary>
        /// <returns>
        /// true if the stream supports reading; otherwise, false.
        /// </returns>
        /// <filterpriority>1</filterpriority>
        public override bool CanRead
        {
            get { return true; }
        }

        /// <summary>
        /// When overridden in a derived class, gets a value indicating whether the current stream supports seeking.
        /// Always returns false.
        /// </summary>
        /// <returns>
        /// true if the stream supports seeking; otherwise, false.
        /// </returns>
        /// <filterpriority>1</filterpriority>
        public override bool CanSeek
        {
			get { return false; }
        }

        /// <summary>
        /// When overridden in a derived class, gets a value indicating whether the current stream supports writing.
        /// Always returns false.
        /// </summary>
        /// <returns>
        /// true if the stream supports writing; otherwise, false.
        /// </returns>
        /// <filterpriority>1</filterpriority>
        public override bool CanWrite
        {
			get { return false; }
        }

        /// <summary>
        /// When overridden in a derived class, gets the length in bytes of the stream.
        /// </summary>
        /// <returns>
        /// A long value representing the length of the stream in bytes.
        /// </returns>
        /// <exception cref="T:System.ObjectDisposedException">Methods were called after the stream was closed.</exception>
        /// <filterpriority>1</filterpriority>
        public override long Length
        {
            get { return webResponse_.ContentLength; }
        }

        /// <summary>
        /// When overridden in a derived class, gets or sets the position within the current stream.
        /// </summary>
        /// <returns>
        /// The current position within the stream.
        /// </returns>
        /// <filterpriority>1</filterpriority>
        /// <exception cref="NotSupportedException"><c>NotSupportedException</c>.</exception>
        public override long Position
        {
			get { return position_; }
			set { throw new NotSupportedException(); }
        }

        #endregion // Overrides of Stream

        #region operations

        /// <summary>
        /// Reads the full string data at the given URI.
        /// </summary>
        /// <returns>The full contents of the given URI.</returns>
        public static string ReadToEnd(string uri, string userAgent, string referer)
        {
            using (WebStream ws = new WebStream(uri, 0))
            {
                ws.UserAgent = userAgent;
                ws.Referer = referer;
                ws.Connect();

                using (StreamReader reader = new StreamReader(ws.stream_))
                {
                    return reader.ReadToEnd();
                }
            }
        }

        /// <summary>
        /// Writes the full data at the given URI to the given stream.
        /// </summary>
        /// <returns>The number of bytes written.</returns>
        public static long WriteToStream(string uri, string userAgent, string referer, Stream stream)
        {
            using (WebStream ws = new WebStream(uri, 0))
            {
                ws.UserAgent = userAgent;
                ws.Referer = referer;
                ws.Connect();

                long total = 0;
                byte[] buffer = new byte[64 * 1024];
                int read;
                while ((read = ws.stream_.Read(buffer, 0, buffer.Length)) > 0)
                {
                    stream.Write(buffer, 0, read);
                    total += read;
                }

                return total;
            }
        }

        #endregion // operations

        #region implementation

        protected override void Dispose(bool disposing)
        {
            base.Dispose(disposing);

            if (stream_ != null)
            {
                stream_.Dispose();
                stream_ = null;
            }
        }

        private void Connect()
        {
            Close();

            HttpWebRequest request = WebRequest.Create(Uri) as HttpWebRequest;
            if (request == null)
            {
                return;
            }

            request.UserAgent = UserAgent;
            request.Referer = Referer;
            if (position_ > 0)
            {
                SetWebRequestRange(request, position_, 0);
            }

            webResponse_ = request.GetResponse();
            stream_ = webResponse_.GetResponseStream();
        }

        /// <summary>
        /// Sets an inclusive range of bytes to download.
        /// </summary>
        /// <param name="request">The request to set the range to.</param>
        /// <param name="from">The first byte offset, negative to omit.</param>
        /// <param name="to">The last byte offset, less than <paramref name="from"/> to omit.</param>
        private static void SetWebRequestRange(HttpWebRequest request, long from, long to)
        {
            string first = from >= 0 ? from.ToString() : string.Empty;
            string last = to >= from ? to.ToString() : string.Empty;

            string val = string.Format("bytes={0}-{1}", first, last);

            Type type = typeof(WebHeaderCollection);
            MethodInfo method = type.GetMethod("AddWithoutValidate", BindingFlags.Instance | BindingFlags.NonPublic);
            method.Invoke(request.Headers, new object[] { "Range", val });
        }

        #endregion // implementation

        #region representation

        private long position_;
        private WebResponse webResponse_;
        private Stream stream_;

        #endregion // representation
	}
}
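As a usage illustration, resuming a partial download with the class above could be sketched as follows. The ResumeExample class, the Resume method and its parameters are hypothetical names of mine. The sketch also assumes the server honours the Range header; a server that ignores it would send the file from the beginning, and the appended data would be wrong.

```csharp
using System.IO;
using Web;

internal static class ResumeExample
{
    // Appends the missing tail of the resource at uri to the partial file at path.
    internal static void Resume(string uri, string path)
    {
        // FileMode.Append opens (or creates) the file positioned at its end,
        // so file.Length is the offset of the first byte still missing.
        using (FileStream file = new FileStream(path, FileMode.Append, FileAccess.Write))
        using (WebStream web = new WebStream(uri, file.Length))
        {
            byte[] buffer = new byte[64 * 1024];
            int read;

            // WebStream.Read returns 0 at end of stream and -1 after a failure,
            // so the loop stops in both cases.
            while ((read = web.Read(buffer, 0, buffer.Length)) > 0)
            {
                file.Write(buffer, 0, read);
            }
        }
    }
}
```

If the loop stops early because of a network error, calling Resume again picks up from the new file length.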

I hope this saves someone some frustration and perhaps even time writing this handy class. Enjoy.

Apr 10 2011
 

The WordPress plugin I use to manage my reading list is an updated version of the popular Now Reading plugin, called Now Reading Reloaded.

Original NRR book management page.

The original Now Reading plugin was developed by Rob Miller, who stopped maintaining it at around WP2.5. Luckily, Ben Gunnink at heliologue.com picked up where Rob left off and gave us Now Reading Reloaded (NRR). Unfortunately, it seems that the maintainer has no time to donate to the project and has ceased maintaining it.

As one can see, the plugin is a great way to organize and share reading lists. This is exactly what I was looking for when I searched the WP plugin database. Up until then I was tracking my reading lists using simple lists in WP posts and pages. The database back-end of NRR gives it much more potential than one can hope to achieve with simple lists. And that is to say nothing of the Amazon search-n-add feature, with a link and thumbnail of the book cover. All very welcome features.

Limitations

Unfortunately, NRR is far from perfect. One can run into its quirks multiple times during a single book update. But, for the price, I can’t complain. None of the issues were big enough to annoy me into getting my hands dirty debugging and patching the code. None, that is, except for one obvious feature that is conspicuously missing. By the time I had added most of the books in my lists and started updating what I had read and adding current-reading items, I wished I didn’t have to jump from page to page until I found the book I had just finished to update its status. That is, I wished I could simply sort my library by status, started-reading or finished-reading dates. Even better, I wished that by default the library showed me the latest books I had started reading, which I was most probably interested in updating.

While open-source programs are great for too many reasons to list here, they all suffer from the same problem. Namely, the freedom to update the code also, by definition, forks and diverges it from the source. This immediately adds the overhead of merging any updates or bug-fixes added to the source into your hand-modified version. However, since it seems that NRR is no longer maintained, I had no reason to fear such a hassle, and I still had all the reasons to get sorting functionality in the admin library page, also known as the Manage Books page.

Figuring out the code

First test with new sorting code.

First I had to figure out the wheres and the hows of the code. I wasn’t familiar with NRR, so I had some detective work ahead of me. First, I opened the page in question: wp-admin/admin.php?page=manage_books. I browsed through the pages and noticed how the URL was formed. Apparently, the page to display is chosen by specifying a ‘p’ argument with the index of the page. For example, the third page would be /wp-admin/admin.php?page=manage_books&p=3. Next I had to find the PHP file responsible for generating this page. Since each plugin is stored in its own separate folder, the first place to look was the NRR plugin folder within my WP installation. On my WP3.1 installation, this folder is at /wp-content/plugins/now-reading-reloaded/admin. Within this folder a few PHP files exist with the “admin” prefix. The file “admin-manage.php” seemed a reasonable start. I looked for any table construction code that resembles the Manage Books page, and sure enough it’s right there.

Patching

Page sorting works.

(Note: Breaking the PHP files may leave your WP in limbo. Make sure you have backup of your site before you make manual changes, and that you can replace broken files via ssh or ftp.)

The key to adding sorting is to figure out how the database query is generated. This, as it turns out, wasn’t very difficult to find. Within the admin-manage.php file, the call to the get_books function contains the complete query:

$books = get_books("num=-1&status=all&orderby=status&order=desc{$search}{$pageq}{$reader}");

From this, it’s quite obvious which filter is responsible for what. The ‘orderby’ filter selects the sorting column, ‘order’ decides the direction of sorting and the rest are for searching, pagination etc. Instead of using hard-coded values, we need to get these values from the browser. Let’s add a couple of optional parameters to the page:

            if ( empty($_GET['o']) )
                $order = 'desc';
            else
                $order = urlencode($_GET['o']);

            if ( empty($_GET['s']) )
                $orderby = 'started';
            else
                $orderby = urlencode($_GET['s']);

So ‘o’ will decide the ‘order’ and ‘s’ the ‘orderby’ value. Before I move on, I have to test that this is working in practice and not just in theory. Manually loading the page with the newly added parameters gives the expected results. /wp-admin/admin.php?page=manage_books&s=author&o=asc loads, as expected, the table of books sorted by the author name in ascending order. Now, all we have to do is add links to the column headers and we can get a working page.

			if ( $order == 'asc' )
				$new_order = 'desc';
			else
				$new_order = 'asc';

			$book_link = "{$nr_url->urls['manage']}&p=$page&s=book&o=$new_order";
			$author_link = "{$nr_url->urls['manage']}&p=$page&s=author&o=$new_order";
			$added_link = "{$nr_url->urls['manage']}&p=$page&s=added&o=$new_order";
			$started_link = "{$nr_url->urls['manage']}&p=$page&s=started&o=$new_order";
			$finished_link = "{$nr_url->urls['manage']}&p=$page&s=finished&o=$new_order";
			$status_link = "{$nr_url->urls['manage']}&p=$page&s=status&o=$new_order";

            echo '
				<table class="widefat post fixed" cellspacing="0">
					<thead>
						<tr>
							<th></th>
							<th class="manage-column column-title"><a class="manage_books" href="'. $book_link .'">Book</a></th>
							<th class="manage-column column-author"><a class="manage_books" href="'. $author_link .'">Author</a></th>
							<th><a class="manage_books" href="'. $added_link .'">Added</a></th>
							<th><a class="manage_books" href="'. $started_link .'">Started</a></th>
							<th><a class="manage_books" href="'. $finished_link .'">Finished</a></th>
							<th><a class="manage_books" href="'. $status_link .'">Status</a></th>
						</tr>
					</thead>
					<tbody>
			';

Notice that I didn’t forget to invert the current sorting order in the link. This is pretty straightforward. I reloaded the page, tested the header links and sure enough all was well. One thing missing is the page selection links: they don’t obey the current sorting. The page selection links had to be patched as well:

                $pages .= " <a class='page-numbers prev' href='{$nr_url->urls['manage']}&p=$previous&s=$orderby&o=$order'>«</a>";
                $pages .= " <a class='page-numbers' href='{$nr_url->urls['manage']}&p=$i&s=$orderby&o=$order'>$i</a>";
                $pages .= " <a class='page-numbers next' href='{$nr_url->urls['manage']}&p=$next&s=$orderby&o=$order'>»</a>";

New default book management page view.

Download

Perfecto! Works great.

One last thing I thought would be useful was to reorder the columns, such that it’s “Book, Author, Status, Started, Finished, Added” instead of the default “Book, Author, Added, Started, Finished, Status”. Because frankly, I don’t care much when I added a book, I care most about its current status and when I started reading it. Once I’m done, I want to set the Finished date and move on to the next. After reordering the columns, I set the default sorting on the Started date and in descending order, as you can see in the screenshot.

Download NRR v5.1.3.2 with the sorting patch.

Mar 28 2009
 

Recently I’ve been interested in web dev and web standards after disappearing from the scene for about a decade. Back in ’97 I was enjoying the ride of the web technologies, design, server and client-side programming, browser wars and what not. But all that had to end someday and that day came in mid 2000. I have a performance craze and wanted to dive deep into the world of desktop applications, servers and particularly highly scalable architecture, low-level (read: assembly) optimization and advanced algorithms. So I left the world wide web only to come back and rediscover it a decade later.

Finally, the interactive web is here. Finally! Looking back at static HTML pages and how boring they looked, how hard and complicated it was just to get feedback from users and, even with such primitive functionality, how incompatible web browsers were, gives me a shiver.

Now I have to cover a lot of ground. So much has changed and there is so much to catch up with. I have some experience with Python; honestly, I love it. I’ve been using it on and off for a few years now. I thought I’d start with Python-based solutions, beginning with a simple Python HTTP server (web.py). I do enjoy reading code, so I started by reading web.py. Fun stuff.

Django Tutorials? How about one better: Screen Casts! If reading code is good, reading working-site code should be better. Last, but never least, I had to get dirty with some code and created a toy Wiki in Django. I should say, the learning curve was rather smooth. Django has some quirks (queries) and some things must be done in a very specific way (projects/apps, models/views/urls …) but the thing that I find a bit annoying (especially during development) is the lack of support to import/create initial data (in the DB). The syncdb command is handy, but doesn’t work as expected when you change existing models. It probably works for some trivial cases, but add/remove some column, or mark one as index (or remove an index) and it will silently ignore you.

It’s come a long way since ’05 though. So I’ll cut it some slack. Plus, it’s probably the best Python web framework. That’s not to say it’s perfect, far from it. Version 1.0 is in the very recent past (current stable version is 1.0.2 with 1.1 beta just released.)

Django and Python, what next? jQuery, JSON and Ajax. Then jump to CSS, Selectors and HTML 5.0. Not that they all work properly on all modern browsers, but exciting technologies nonetheless.
