I used InetAddress to parse IP addresses, but now I need to store the hostname when the IP is unavailable, so I introduced a class Host.

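import java.net.{InetAddress, UnknownHostException}
import java.util.regex.Pattern

case class Host(name:String, ip:InetAddress) {
    // Prefer the resolved address; fall back to the raw name when there is none.
    override def toString:String = if (ip!=null) {ip.getHostName} else {name}
}

object Host {
    implicit def stringToPattern(s:String): Pattern = Pattern.compile(s)
    // Compiled once and reused for every parsed string.
    val separators = Seq[Pattern]("\\.", ":")
    def separatedStrToBytes(s:String, separator:Pattern): Array[Byte] = {
        val fields = separator.split(s)
        val rv = new Array[Byte](fields.length)
        // toInt throws NumberFormatException for non-decimal fields.
        fields.map(_.toInt.toByte).copyToArray(rv, 0)
        rv
    }
    // Returns null when s contains none of the separators.
    implicit def strToBytes(s:String): Array[Byte] = {
        for (sep <- separators)
            if (sep.matcher(s).find())
                return separatedStrToBytes(s, sep)
        null
    }
    implicit def strToHost(s:String):Host = {
        var ip:InetAddress = null
        try {
            val bytes = strToBytes(s)
            if (bytes != null) {
                ip = InetAddress.getByAddress(bytes)
            }
        } catch {
            // getByAddress rejects arrays that are not 4 or 16 bytes long.
            case e:UnknownHostException =>
            // A dotted hostname such as "example.com" has non-numeric fields.
            case e:NumberFormatException =>
        }
        if (ip==null) {
            try {
                ip = InetAddress.getByName(s)
            } catch {
                // Unresolvable: keep ip == null so only the name is stored.
                case e:UnknownHostException =>
            }
        }
        new Host(s, ip)
    }
}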

With this change my software started to fail with "java.lang.OutOfMemoryError: GC overhead limit exceeded" in separatedStrToBytes. Have I made any memory handling mistakes here?

I would appreciate any comments on the design. I was unable to make the parsing shorter because InetAddress.getByAddress requires an Array[Byte] argument. The target platform runs Scala 2.7.7.

EDIT: I've replaced the parsing with dummies and found that my program still fails elsewhere, a few megabytes of parsed data later. Each replacement of String.split(s:String) with Pattern.split(s:String) on a precompiled pattern makes it run slightly longer. That doesn't solve my problem, but this question may be closed now. I still need design comments though.

+2  A: 

There's no need to parse URIs manually like this; simply use the existing URI class from the standard Java library: http://download.oracle.com/javase/6/docs/api/java/net/URI.html

Don't use the URL class though, under any circumstances. It has a crazy hashing algorithm (hashCode and equals first resolve the hostname to an IP address), which is one of the main reasons why so many URL-using Java tools (such as the Eclipse update manager) are very slow to start up when you don't have a net connection.
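
As a minimal sketch of what URI gives you, the snippet below pulls the components out of an address without any name resolution. The address and the object name UriDemo are made up for illustration; only the java.net.URI accessors come from the standard library.

```scala
import java.net.URI

object UriDemo {
  // Example address chosen for illustration only.
  val uri = new URI("http://example.com:8080/index.html?q=1")

  // Components are extracted by parsing the text; no DNS lookup happens.
  val scheme: String = uri.getScheme
  val host: String = uri.getHost
  val port: Int = uri.getPort
  val path: String = uri.getPath
  val query: String = uri.getQuery

  // Equality compares the parsed components, again without resolving the host.
  val sameAsCopy: Boolean =
    uri == new URI("http://example.com:8080/index.html?q=1")
}
```

Because nothing is resolved, this behaves the same with or without a network connection.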

Kevin Wright
Good idea. How can I use URI to parse UDP endpoints? I can extract an IP string from an endpoint, but what do I do next? I can't find a way to make a custom scheme work with URI.
Basilevs
Examples of exactly what you want to parse would be a good starting point...
Kevin Wright
1288495632 17 10.3.0.1 138 10.3.255.255 138 461 eth0 unknown
Basilevs
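
For a whitespace-separated record like the sample above, splitting the line and handing the octets to InetAddress.getByAddress may be enough, with no URI involved. The sketch below is only a guess at the layout: it assumes the third to sixth fields are source address, source port, destination address and destination port, which the thread does not confirm. Endpoint, RecordParser, parseIPv4 and parseLine are hypothetical names, and the code assumes Scala 2.8 or later.

```scala
import java.net.InetAddress

// Hypothetical pairing of an address with a port.
case class Endpoint(address: InetAddress, port: Int)

object RecordParser {
  // Converts a dotted-quad literal to an InetAddress without a DNS lookup.
  // Returns None for anything that is not four decimal octets in 0..255.
  def parseIPv4(s: String): Option[InetAddress] = {
    val fields = s.split("\\.")
    if (fields.length != 4) None
    else try {
      val octets = fields.map(_.toInt)
      if (octets.forall(o => o >= 0 && o <= 255))
        Some(InetAddress.getByAddress(octets.map(_.toByte)))
      else None
    } catch {
      case _: NumberFormatException => None
    }
  }

  // ASSUMPTION: fields 2..5 (zero-based) are source address, source port,
  // destination address, destination port.
  def parseLine(line: String): Option[(Endpoint, Endpoint)] = {
    val f = line.trim.split("\\s+")
    if (f.length < 6) None
    else try {
      for {
        src <- parseIPv4(f(2))
        dst <- parseIPv4(f(4))
      } yield (Endpoint(src, f(3).toInt), Endpoint(dst, f(5).toInt))
    } catch {
      case _: NumberFormatException => None
    }
  }
}
```

Returning Option keeps malformed lines from aborting a long parsing run; a caller can simply skip records that yield None.
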
+5  A: 

Your code runs just fine against Scala 2.8.0 (you should consider migrating to it, as it's already final and quite stable); no OutOfMemoryError detected.

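Some simplifications of the kind you were asking for:

implicit def strToBytes(s:String): Array[Byte] = (for {separator <- separators find(_.matcher(s).find)} yield separatedStrToBytes(s, separator)) getOrElse null

// Not implicit: a two-argument method cannot act as a conversion. Splitting with the precompiled Pattern avoids recompiling the regex on every call.
def separatedStrToBytes(s:String, separator:Pattern): Array[Byte] = separator.split(s).map(Integer.parseInt(_).toByte)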

scala> import Host._
import Host._
scala> strToBytes("127.0.0.1")
res9: Array[Byte] = Array(127, 0, 0, 1)
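
On the design side, one possible variant of the two helpers returns Option instead of null and treats non-numeric fields as "not an address" rather than letting NumberFormatException escape. This is a sketch, not a drop-in replacement: the object name HostParsing is hypothetical, and it assumes Scala 2.8 or later.

```scala
import java.util.regex.Pattern

object HostParsing {
  // Compiled once and reused for every call.
  val separators: Seq[Pattern] = Seq("\\.", ":").map(s => Pattern.compile(s))

  // None when any field is not a decimal number (e.g. "example.com").
  def separatedStrToBytes(s: String, separator: Pattern): Option[Array[Byte]] =
    try Some(separator.split(s).map(f => Integer.parseInt(f).toByte))
    catch { case _: NumberFormatException => None }

  // None when no separator occurs in s or the fields do not parse.
  def strToBytes(s: String): Option[Array[Byte]] =
    separators.find(_.matcher(s).find()).flatMap(sep => separatedStrToBytes(s, sep))
}
```

A caller can then fall back to the hostname with something like strToBytes(s).map(bytes => InetAddress.getByAddress(bytes)), keeping the null checks out of the parsing code.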
Vasil Remeniuk