
Lazy Initialization of Known Coordinate Systems

Apr 19, 2012 at 4:09 PM

I was profiling my application and noticed that a fair amount of time is spent during Map initialization creating the Known Coordinate Systems, because every known coordinate system is created up front. This also increases the memory used by the application. Some applications may require DotSpatial.Projections.dll but may not need KnownCoordinateSystems. I played around a bit with varying degrees of just-in-time creation and tried 2 approaches, both of which appeared to work. I outline them below. I would like to propose we at least implement the 1st approach since it is a simple change. If the 2nd approach is used, the 1st approach is not needed.

 

Approach 1:

1. I made a simple change to the static class KnownCoordinateSystems from this:

 

    public static class KnownCoordinateSystems
    {
        /// <summary>
        /// Geographic systems operate in angular units, but can use different
        /// spheroid definitions or angular offsets.
        /// </summary>
        public static GeographicSystems Geographic = new GeographicSystems();

        /// <summary>
        /// Projected systems are systems that use linear units like meters or feet
        /// rather than angular units like degrees or radians
        /// </summary>
        public static ProjectedSystems Projected = new ProjectedSystems();
    }

 

to this:

 

    public static class KnownCoordinateSystems
    {
        private static GeographicSystems _geographic;
        private static ProjectedSystems _projected;

        /// <summary>
        /// Projected systems are systems that use linear units like meters or feet
        /// rather than angular units like degrees or radians
        /// </summary>
        public static ProjectedSystems Projected
        {
            get { return _projected ?? (_projected = new ProjectedSystems()); }
        }

        /// <summary>
        /// Geographic systems operate in angular units, but can use different
        /// spheroid definitions or angular offsets.
        /// </summary>
        public static GeographicSystems Geographic
        {
            get { return _geographic ?? (_geographic = new GeographicSystems()); }
        }
    }

This change delays creation of the Projected or Geographic systems until they are actually accessed, which speeds up initialization of any application that doesn't care about KnownCoordinateSystems. However, once an application accesses either Projected or Geographic, all of its categories (and their child coordinate systems) are created at that point, using memory for many unused coordinate systems.

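To illustrate the deferred creation described above, here is a minimal sketch. `ExpensiveSystems` and `KnownSystemsSketch` are hypothetical stand-ins, not the real DotSpatial classes; the counter only exists to show when the constructor runs. Note that the `??` pattern is not synchronized, so it is only safe as written if first access happens on a single thread.

```csharp
using System;

// Hypothetical stand-in for GeographicSystems/ProjectedSystems; counts constructions.
public class ExpensiveSystems
{
    public static int Constructed;
    public ExpensiveSystems() { Constructed++; }
}

public static class KnownSystemsSketch
{
    private static ExpensiveSystems _projected;

    // Created on first access only. Not synchronized: two threads racing
    // through this getter could each construct an instance.
    public static ExpensiveSystems Projected
    {
        get { return _projected ?? (_projected = new ExpensiveSystems()); }
    }
}

public static class LazyPropertyDemo
{
    public static void Main()
    {
        Console.WriteLine(ExpensiveSystems.Constructed);   // 0: nothing built yet
        ExpensiveSystems first = KnownSystemsSketch.Projected;
        ExpensiveSystems second = KnownSystemsSketch.Projected;
        Console.WriteLine(ExpensiveSystems.Constructed);   // 1: built once, on first access
        Console.WriteLine(ReferenceEquals(first, second)); // True: same cached instance
    }
}
```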
2. The next thing I tried was just-in-time creation of categories in the ProjectedSystems and GeographicSystems classes.  This was a little more work, but appeared to work. I won't show all the changes, but here are the representative ones:
The old code was:
namespace DotSpatial.Projections
{
    /// <summary>
    /// Projected
    /// </summary>
    public class ProjectedSystems
    {
        private string[] _names;

        #region Fields

        public readonly Africa Africa;
        // and so on...
        public ProjectedSystems()
        {
            Africa = new Africa();
            // and so on...
        }


        private void AddNames()
        {
            FieldInfo[] flds = GetType().GetFields(BindingFlags.Public | BindingFlags.Instance);
            _names = new string[flds.Length];
            for (int i = 0; i < flds.Length; i++)
            {
                _names[i] = flds[i].Name;
            }
        }

        /// <summary>
        /// Given the string name, this will return the specified coordinate category
        /// </summary>
        /// <param name="name"></param>
        /// <returns></returns>
        public CoordinateSystemCategory GetCategory(string name)
        {
            FieldInfo[] flds = GetType().GetFields(BindingFlags.Public | BindingFlags.Instance);
            for (int i = 0; i < flds.Length; i++)
            {
                if (flds[i].Name == name)
                {
                    return flds[i].GetValue(this) as CoordinateSystemCategory;
                }
            }
            return null;
        }
    }
}

The new code is:
namespace DotSpatial.Projections
{
    /// <summary>
    /// Projected
    /// </summary>
    public class ProjectedSystems
    {
        private string[] _names;

        #region Fields

        private Africa _africa;
        // and so on...
        public Africa Africa
        {
            get { return _africa ?? (_africa = new Africa()); }
        }
        // and so on...
        private void AddNames()
        {
            PropertyInfo[] properties = GetType().GetProperties(BindingFlags.Public | BindingFlags.Instance);
            _names = (from property in properties where property.Name != "Names" select property.Name).ToArray();
        }
        public CoordinateSystemCategory GetCategory(string name)
        {
            PropertyInfo property = GetType().GetProperty(name, BindingFlags.Public | BindingFlags.Instance);
            if (null == property)
                return null;
            return property.GetValue(this, null) as CoordinateSystemCategory;
        }
    }
}
With this 2nd approach, a category is created just-in-time, thus only requiring memory for the child coordinate systems in that category. However, I had to change the way we use Reflection to get the Names and Categories from using FieldInfo to PropertyInfo. This could be taken even further to just-in-time creation of individual coordinate systems in a category, but that would be much more tedious.
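
As a rough sketch of why the switch from FieldInfo to PropertyInfo keeps things lazy: listing property names only reads metadata, while GetValue runs the getter for the one category requested. The types below (`CategorySketch`, `AfricaSketch`, `AsiaSketch`, `ProjectedSystemsSketch`) are hypothetical stand-ins for illustration, not the actual DotSpatial classes.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical stand-ins for CoordinateSystemCategory and two category classes.
public class CategorySketch { }
public class AfricaSketch : CategorySketch
{
    public static int Constructed;
    public AfricaSketch() { Constructed++; }
}
public class AsiaSketch : CategorySketch
{
    public static int Constructed;
    public AsiaSketch() { Constructed++; }
}

public class ProjectedSystemsSketch
{
    private AfricaSketch _africa;
    private AsiaSketch _asia;
    private string[] _names;

    public AfricaSketch Africa { get { return _africa ?? (_africa = new AfricaSketch()); } }
    public AsiaSketch Asia { get { return _asia ?? (_asia = new AsiaSketch()); } }

    // Names come from property metadata, so no getter runs and nothing is created.
    public string[] Names
    {
        get
        {
            if (_names == null)
            {
                _names = GetType()
                    .GetProperties(BindingFlags.Public | BindingFlags.Instance)
                    .Where(p => p.Name != "Names")
                    .Select(p => p.Name)
                    .ToArray();
            }
            return _names;
        }
    }

    // Invoking the getter through reflection creates only the requested category.
    public CategorySketch GetCategory(string name)
    {
        PropertyInfo property = GetType().GetProperty(name, BindingFlags.Public | BindingFlags.Instance);
        return property == null ? null : property.GetValue(this, null) as CategorySketch;
    }
}

public static class ReflectionDemo
{
    public static void Main()
    {
        ProjectedSystemsSketch systems = new ProjectedSystemsSketch();
        Console.WriteLine(systems.Names.Length);                              // 2
        Console.WriteLine(AfricaSketch.Constructed + AsiaSketch.Constructed); // 0: listing names built nothing
        systems.GetCategory("Africa");
        Console.WriteLine(AfricaSketch.Constructed);                          // 1
        Console.WriteLine(AsiaSketch.Constructed);                            // 0: Asia is still untouched
        Console.WriteLine(systems.GetCategory("Nowhere") == null);            // True
    }
}
```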

 

Apr 19, 2012 at 4:58 PM

I think the 1st approach does not buy us too much since the default Map uses KnownCoordinateSystems.World.Wgs84

Kyle

Apr 19, 2012 at 6:08 PM

I'm not opposed to either of these.

I would expect at some point it would be more useful to put all of the projections in a database.

Apr 19, 2012 at 6:19 PM

I'll create an issue and assign myself to it. Along the lines of the database, we also need to move some of the grids out of the Projections DLL and store those externally as well.  This would greatly reduce the size of the DLL.  We may already have an issue for that.

Apr 19, 2012 at 9:45 PM

I don't think there is an issue created for that, but we agree.

Apr 19, 2012 at 9:53 PM

If the issue to externalize the grids existed, it was definitely pre-TFS.

Apr 20, 2012 at 11:31 AM

There is a Lazy<T> class in .NET 4: http://msdn.microsoft.com/en-us/library/dd642331.aspx

So your code will be:

private static Lazy<ProjectedSystems> _projected = new Lazy<ProjectedSystems>(() => new ProjectedSystems(), true);

public static ProjectedSystems Projected
{
    get { return _projected.Value; }
}
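
One thing Lazy<T> adds over a plain `??` property is thread safety: passing `true` as the second constructor argument makes the factory run at most once even under concurrent first access. The sketch below tries to show that with hypothetical stand-in types (`ExpensiveSystemsSketch`, `LazyHolderSketch`); it is not taken from the DotSpatial source.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for ProjectedSystems; counts constructions atomically.
public class ExpensiveSystemsSketch
{
    public static int Constructed;
    public ExpensiveSystemsSketch() { Interlocked.Increment(ref Constructed); }
}

public static class LazyHolderSketch
{
    // isThreadSafe: true means only one thread runs the factory; others wait for its result.
    private static readonly Lazy<ExpensiveSystemsSketch> _projected =
        new Lazy<ExpensiveSystemsSketch>(() => new ExpensiveSystemsSketch(), true);

    public static ExpensiveSystemsSketch Projected { get { return _projected.Value; } }

    // Reports whether the value has been created, without triggering creation.
    public static bool IsCreated { get { return _projected.IsValueCreated; } }
}

public static class LazyDemo
{
    public static void Main()
    {
        Console.WriteLine(LazyHolderSketch.IsCreated);         // False: nothing built yet
        // Hit the property from several threads at once.
        Parallel.For(0, 16, i => GC.KeepAlive(LazyHolderSketch.Projected));
        Console.WriteLine(ExpensiveSystemsSketch.Constructed); // 1: the factory ran once
    }
}
```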

Apr 25, 2012 at 8:46 PM

I like the idea of making a change like this. Currently ProjectionInfo.ParseProj4String emits warnings about a few strange projections during initialization. The second suggestion or something like it will avoid these in addition to the memory and speed savings.

Lazy sounds like an interesting approach that might simplify things but I am not sure it is any better than just turning fields into self-initializing properties.

I am tempted to push this a step further and see if we could either use Lazy or self-initializing properties for each of the ProjectionInfo objects within a CoordinateSystemCategory instead of initializing them all in each CoordinateSystemCategory constructor. I can see that it would be a bit more work to do this and agree that the second approach above is worth doing whether or not we push a step further.

I have gone ahead and developed a little custom refactoring code that converts ProjectedCategories/Africa.cs to the version that can be found at:

http://hspf.com/pub/D4EM/Africa.cs

It would hopefully not be hard to run this refactoring on all of the CoordinateSystemCategory classes.

Should I do this? If so, are there any changes I should make in the refactoring algorithm first?

My refactoring code for this is available in http://hspf.com/pub/D4EM/Refactor.zip (I am using VB.Net to refactor C#, feel free to call me strange.)

Apr 25, 2012 at 10:48 PM

@vatavian,

I've reviewed your changes and don't see any problems. If you really want to see what happens with these changes, I would recommend creating a couple of test cases and profiling the new version and old versions to make sure the speed and memory improvements are what you would expect them to be.
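
A rough sketch of what such a test case could look like, using Stopwatch for timing and GC.GetTotalMemory for an approximate memory delta. `InitProfilerSketch` and the workloads in Main are hypothetical stand-ins; in a real comparison the factories would touch KnownCoordinateSystems in the old and new builds, and the numbers should be treated as approximate.

```csharp
using System;
using System.Diagnostics;

public static class InitProfilerSketch
{
    // Runs the factory once, prints elapsed time and approximate retained bytes,
    // and returns the elapsed time.
    public static TimeSpan Measure(string label, Func<object> factory)
    {
        long before = GC.GetTotalMemory(true);  // force a collection for a steadier baseline
        Stopwatch watch = Stopwatch.StartNew();
        object result = factory();
        watch.Stop();
        long after = GC.GetTotalMemory(true);
        GC.KeepAlive(result);                   // keep the result alive through the second measurement
        Console.WriteLine("{0}: {1:F2} ms, ~{2} bytes", label, watch.Elapsed.TotalMilliseconds, after - before);
        return watch.Elapsed;
    }

    public static void Main()
    {
        // Stand-in workloads only: an eager allocation versus a deferred one.
        Measure("eager stand-in", () => new int[1000000]);
        Measure("lazy stand-in", () => new Lazy<int[]>(() => new int[1000000]));
    }
}
```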

Apr 26, 2012 at 4:52 PM

vatavian,

I didn't look at the VB code but what you did to Africa looks fine to me.  I agree with you that the Lazy class would be fine but did not seem to add a whole lot.  I also agree that taking it all the way down to individual ProjectionInfo objects would be ideal.  I just didn't want to do it by hand.  But, if you have written some VB to refactor, go for it!  I created an issue (http://dotspatial.codeplex.com/workitem/22520).  Feel free to change the assigned developer to you and run with it.  I'm really covered up with other work right now, so if you want to do it, that would be great and appreciated.

Thanks,

Kyle

May 2, 2012 at 9:34 PM

This issue has just gotten elevated to a high priority by my users.  So, I will likely be checking in enhancements for the ProjectedCategories and GeographicCategories classes in a few days.  I will not be addressing it at the CoordinateSystemCategory level since you are getting to the point of diminishing returns.  We could easily apply vatavian's refactoring at that level later.  BTW, if we make the changes at that level, we will need to also change the Reflection code in the DotSpatial.Projections.CoordinateSystemCategory class to find properties instead of fields, just like I had to do in the GeographicSystems and ProjectedSystems classes.

Kyle