Looking at one of the larger files in Libre Office:
com.sun.star.filter.config.tools.utils.Cache
This file has been removed in the above repository – and for good reasons.
I guess that the file was once closed source, as the com.sun package name suggests, and I found it only fair to expose some defective closed-source code after writing about dotCMS – because, as I stated then: the problem exists in both worlds.
Opening the Libre Office repository this Friday, and spending some time gathering statistics to suggest a specific group of source files to look at, I found this. I must say it is immediately ugly when you just browse through the code. But once you really start to examine it, you would have to be a bit masochistic to work on it.
The code is really horrible on several levels:
- Coupling implementation types instead of interfaces, or Interface vs. specific class
- Primitive Obsession
- Non-enumerable enumerable
- Magic Strings and numbers
- Dynamic objects of known entities
- Modification of parameters and poor understanding of Map
- System.out.println + a logger available.
- Cyclic dependency with XMLHelper
- StringBuffer: Setting size, then using +
- Correcting spelling mistakes, but apparently not all of them
- Unrolled loops
- Hungarian misnomer
- Missing set functionality
- Keeping current
- Not conforming to standard Java code convention for packages
- Sloppy copy-paste
Coupling implementation types instead of interfaces
Using HashMap as the field reference type is considered bad, as it exposes too much specific knowledge of the object. Since the variables are simply used as maps, there is no reason why they can't be declared as Map.
That said, the Maps shouldn't really be raw types either. Trying to change this reveals another issue: dynamic objects of known entities.
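A minimal sketch of the difference, using a hypothetical TypeCache class (the names are mine, not from the original file): the field is typed as the Map interface, and only the construction site mentions the concrete HashMap.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: program against the Map interface, not HashMap.
public class TypeCache {
    // Before: private java.util.HashMap m_lTypes;
    // After: only the initializer knows the concrete class.
    private final Map<String, Map<String, Object>> types = new HashMap<String, Map<String, Object>>();

    public void put(String name, Map<String, Object> props) {
        types.put(name, props);
    }

    public Map<String, Object> get(String name) {
        return types.get(name);
    }
}
```

Swapping HashMap for, say, TreeMap or LinkedHashMap later is then a one-line change.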
Primitive Obsession
No specific modeling is applied to the objects in the solution, even though a lot of specific knowledge is assumed. This is bad, as assumptions rarely hold up.
Non-enumerable enumerable
Flags, E_types, and CFG are conceptually enumerable types, but they are defined as loose constants instead of enums, which usually is a disaster waiting to happen and unnecessarily hampers development.
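As a sketch (the class and constant names here are hypothetical, not the originals), turning a family of int constants into an enum lets the compiler reject values outside the set and check switch statements for exhaustiveness:

```java
// Hypothetical sketch: replace loose int constants with an enum.
public class EntityTypeDemo {
    // Before: public static final int E_TYPES = 0; E_FILTERS = 1; ...
    public enum Entity { TYPES, FILTERS, DETECTSERVICES, FRAMELOADERS, CONTENTHANDLERS }

    // An int parameter accepts any value; an enum parameter cannot.
    public static String describe(Entity e) {
        switch (e) {
            case TYPES:   return "type entries";
            case FILTERS: return "filter entries";
            default:      return e.name().toLowerCase();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(Entity.TYPES));
    }
}
```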
Magic Strings and numbers
The code is littered with magic strings and numbers – even though there are constant definitions as well. This often leads to undesirable issues, as a single typo can go undetected.
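A small illustration of why this matters (the constant value "UIName" is my assumption, not verified against the original): a typo in a repeated string literal compiles fine and silently misses, whereas a typo in a constant name would not even compile.

```java
public class MagicStringDemo {
    // Assumed value; in the real code this is one of the PROPNAME_* constants.
    public static final String PROPNAME_UINAME = "UIName";

    public static void main(String[] args) {
        java.util.Map<String, Object> type = new java.util.HashMap<String, Object>();
        type.put(PROPNAME_UINAME, "Writer");

        // A single typo in a literal compiles and silently returns null:
        System.out.println(type.get("UINname"));        // wrong key, no compiler help
        // A typo in PROPNAME_UINAME would be a compile error instead:
        System.out.println(type.get(PROPNAME_UINAME));
    }
}
```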
Dynamic objects of known entities
This is a combination of the raw types mentioned under coupling and primitive obsession. The maps are used to store entities defined by the PROPS constants.
<sarcasm>If only there was a way in which one could define an object of known entities – oh wait, that’s one of the benefits of a programming language.</sarcasm>
E.g. m_lTypes should be a Map<String, Map<String, Object>> if not sticking to the raw type. The keys to the value map are defined by:
Key                      | Value type
-------------------------|--------------------
PROPNAME_CLIPBOARDFORMAT | String
PROPNAME_CONTENTHANDLER  | String
PROPNAME_DETECTSERVICE   | String
PROPNAME_DOCUMENTICONID  | Integer
PROPNAME_EXTENSIONS      | Vector<String>
PROPNAME_FRAMELOADER     | String
PROPNAME_MEDIATYPE       | String
PROPNAME_PREFERRED       | Boolean
PROPNAME_PREFERREDFILTER | String
PROPNAME_UINAME          | Map<String, String>
PROPNAME_UIORDER         | Integer
PROPNAME_URLPATTERN      | Vector<String>
m_lFilters is – on the surface – of the same type: Map<String, Map<String, Object>>, but the keys and types of the value map are defined as follows:
Key                        | Value type
---------------------------|--------------------
PROPNAME_DOCUMENTSERVICE   | String
PROPNAME_FILEFORMATVERSION | Integer
PROPNAME_FILTERSERVICE     | String
PROPNAME_FLAGS             | Integer
PROPNAME_ORDER             | Integer
PROPNAME_TEMPLATENAME      | String
PROPNAME_TYPE              | String
PROPNAME_UICOMPONENT       | String
PROPNAME_UINAME            | Map<String, String>
PROPNAME_USERDATA          | Vector<String>
m_lDetectServices, m_lFrameLoaders, and m_lContentHandlers are all of the type:
Map<String, Map<String, Vector<String>>>
And they seem to only really use PROPNAME_TYPES as the key to the value map.
Finally, m_lFilterToTypeRegistrations has the following form:
Map<String, Vector<String>>
Besides that, it is not used as a field but as a local variable.
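To make the "known entities" point concrete, here is a sketch of what a value of m_lTypes could look like as a real class instead of a Map<String, Object>. The class and accessor names are mine; only three of the PROPNAME_* properties are shown.

```java
// Hypothetical sketch: an entity class replacing the Map<String, Object>
// values of m_lTypes. The compiler now checks the types instead of
// casts at every call site.
public class FileType {
    private final String mediaType;      // PROPNAME_MEDIATYPE
    private final boolean preferred;     // PROPNAME_PREFERRED
    private final int documentIconId;    // PROPNAME_DOCUMENTICONID
    // ... the remaining properties from the table would follow the same pattern

    public FileType(String mediaType, boolean preferred, int documentIconId) {
        this.mediaType = mediaType;
        this.preferred = preferred;
        this.documentIconId = documentIconId;
    }

    public String getMediaType()   { return mediaType; }
    public boolean isPreferred()   { return preferred; }
    public int getDocumentIconId() { return documentIconId; }
}
```

With this in place, m_lTypes becomes a Map<String, FileType> and every cast disappears.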
Modification of parameters
The method convertDetectServicePropsToExternal – and others – has issues with side effects on its parameters. Whether this is a flaw of Java’s implementation of Map or a lack of understanding on the programmers’ part is difficult to say.
private static java.util.HashMap convertDetectServicePropsToExternal(java.util.HashMap aMap,
                                                                     int nFormat)
    throws java.lang.Exception
{
    java.util.HashMap aResultMap = null;
    switch(nFormat)
    {
        //-----------------------------------
        case FORMAT_60 :
        {
            // no changes!
            aResultMap = aMap;
        }
        break;

        //-----------------------------------
        case FORMAT_6Y :
        {
            // remove localized name
            aResultMap = aMap;
            aResultMap.remove(PROPNAME_UINAME);
        }
        break;

        //-----------------------------------
        default :
            throw new java.lang.Exception("unknown format");
    }
    return aResultMap;
}
Probably the intention was to return a different instance with the same basic values as the given map. The problem is that the assignment, aResultMap = aMap, makes both references point to the same data structure in memory, so the call to remove deletes the key and value from that shared structure – thereby removing it from the original as well. What’s worse, that same reference is then returned.
The solution: Use putAll()
private static Map convertDetectServicePropsToExternal(Map aMap, int nFormat) {
    if (!(nFormat == FORMAT_60 || nFormat == FORMAT_6Y)) {
        throw new IllegalArgumentException("unknown format: " + nFormat);
    }
    Map aResultMap = new HashMap();
    aResultMap.putAll(aMap);
    if (nFormat == FORMAT_6Y) {
        aResultMap.remove(PROPNAME_UINAME);
    }
    return aResultMap;
}
System.out.println + a logger available.
If you want to output information, you should make sure that the information is available to be read. System.out.println() really isn’t doing that, and a logger ought to be used instead. As it stands, the class is actually using a logger, but on top of that there are still calls to System.out.println.
Cyclic dependency with XMLHelper
Cache uses XMLHelper to extract elements from XML – but XMLHelper uses some of the constants defined in Cache. While this seems like a minor issue, the problem becomes evident when XMLHelper needs to help some other entity: it will then pick up constants from that entity, on which Cache will in turn indirectly depend. Thus starts a big ball of mud.
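One conventional way out – sketched here with hypothetical names and constant values – is to move the shared constants into a third class that both Cache and XMLHelper depend on, so the dependency graph becomes a tree instead of a cycle.

```java
// Hypothetical sketch of breaking the Cache <-> XMLHelper cycle:
// the shared constants live in a class both can depend on.
public final class ConfigConstants {
    // Values are assumed for illustration.
    public static final String PROPNAME_UINAME = "UIName";
    public static final String PROPNAME_TYPE = "Type";

    private ConfigConstants() {
        // no instances; constants only
    }
}
// Cache -> ConfigConstants and XMLHelper -> ConfigConstants: no cycle.
```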
StringBuffer: Setting size, then using +
StringBuffer is used to append values in order to circumvent String concatenation, which is costly due to String’s immutability. So if you are using StringBuffer, and you even set the initial buffer size, then please don’t defeat the purpose StringBuffer was intended for by concatenating with + inside the append calls.
There is a chance that the compiler will spot the less-than-splendid blemish and help you out, but don’t rely on it.
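To make the point concrete, here is a small sketch (the class and method names are mine) contrasting the two styles. Both produce the same string, but only the second actually uses the pre-sized buffer; the first builds throwaway Strings anyway.

```java
public class BufferDemo {
    // The anti-pattern: presizing a StringBuffer, then concatenating
    // with + inside append(), which creates intermediate Strings.
    static String mixed(String type, String val) {
        StringBuffer sb = new StringBuffer(256);
        sb.append("<td>" + type + "</td>" + "<td>" + val + "</td>"); // defeats the buffer
        return sb.toString();
    }

    // Appending each piece lets the presized buffer do its job.
    static String chained(String type, String val) {
        StringBuffer sb = new StringBuffer(256);
        sb.append("<td>").append(type).append("</td>")
          .append("<td>").append(val).append("</td>");
        return sb.toString();
    }
}
```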
Correcting spelling mistakes, but apparently not all of them
It seems that the code – under the Libre Office banner – has been modified a handful of times, mostly to fix spelling mistakes in the comments. This is not the biggest issue in the code base, but if you are really adamant about the text quality, then please fix all the mistakes:
“How these entries will be readed or written can be switch in different modes.”
“superflous”
“They dont have any valid document service name set and cant be handled”
Unrolled loops
It puzzles me that the same code is repeated instead of being extracted into a method. This kind of unrolling or in-lining can be seen in toHTML and other methods, e.g. convertFilterFlagNames2Values and convertFilterFlagValues2Names, which we shall dive into later.
We see something similar in the toXML method, where there is a loop over the magic values 0 to 5, which actually corresponds to the enumerable eType – meaning that when adding a new type you will likely forget to update this loop bound.
java.util.Iterator aIt = m_lTypes.keySet().iterator();
while (aIt.hasNext())
{
    java.lang.String  sType = (java.lang.String)aIt.next();
    java.util.HashMap aType = (java.util.HashMap)m_lTypes.get(sType);

    sRelationView.append("<tr>");
    sRelationView.append("<td>"+sType+"</td>");

    java.lang.String sVal = (java.lang.String)aType.get(PROPNAME_DETECTSERVICE);
    if (sVal == null || sVal.length()<1)
        sRelationView.append("<td> - </td>");
    else
        sRelationView.append("<td>"+sVal+"</td>");

    sVal = (java.lang.String)aType.get(PROPNAME_PREFERREDFILTER);
    if (sVal == null || sVal.length()<1)
        sRelationView.append("<td> - </td>");
    else
        sRelationView.append("<td>"+sVal+"</td>");

    sVal = (java.lang.String)aType.get(PROPNAME_FRAMELOADER);
    if (sVal == null || sVal.length()<1)
        sRelationView.append("<td> - </td>");
    else
        sRelationView.append("<td>"+sVal+"</td>");

    sVal = (java.lang.String)aType.get(PROPNAME_CONTENTHANDLER);
    if (sVal == null || sVal.length()<1)
        sRelationView.append("<td> - </td>");
    else
        sRelationView.append("<td>"+sVal+"</td>");

    sRelationView.append("</tr>");
}
Could be replaced by something ugly but less verbose.
Iterator<String> aIt = m_lTypes.keySet().iterator();
while (aIt.hasNext())
{
    String sType = aIt.next();
    Map aType = (Map)m_lTypes.get(sType);
    sRelationView.append("<tr>");
    sRelationView.append(tableData(sType));
    sRelationView.append(tableData((String) aType.get(PROPNAME_DETECTSERVICE)));
    sRelationView.append(tableData((String) aType.get(PROPNAME_PREFERREDFILTER)));
    sRelationView.append(tableData((String) aType.get(PROPNAME_FRAMELOADER)));
    sRelationView.append(tableData((String) aType.get(PROPNAME_CONTENTHANDLER)));
    sRelationView.append("</tr>");
}

private String tableData(String sVal) {
    StringBuffer sb = new StringBuffer();
    sb.append("<td>");
    if (sVal == null || sVal.length()<1)
        sb.append(" - ");
    else
        sb.append(sVal);
    sb.append("</td>");
    return sb.toString();
}
Hungarian misnomer
My personal belief is that Hungarian notation is misleading and verbose – for strongly typed languages – in the same manner that method comments can be misleading if not updated in unison with code changes.
The code has several misleading Hungarian notations, e.g.
/** list of all located types.
 * Format: [string,HashMap]
 */
private java.util.HashMap m_lTypes;
Maps should – according to http://msdn.microsoft.com/en-us/library/aa260976(v=vs.60).aspx – be prefixed with mpXY, thus for the map at hand it would be:
private java.util.HashMap m_mpSMPTypes;
Whether the internal maps’ types should be part of the name I have no idea. I would rather have the Java notation correct, that is:
private Map<String, Map<String, Object>> types;
And if fixing another issue – namely that which is caused by the Dynamic entities – where the Map<String, Object> really ought to be its own entity, e.g. Type, then:
private Map<String, Type> types;
But if there is a necessity for Hungarian notation, then stick to it.
Missing set functionality
Often the programming language used has an implication on the solution provided, sometimes for the good: e.g. QuickSort is easy in a functional language such as Haskell (see http://www.haskell.org/haskellwiki/Introduction#Quicksort_in_Haskell).
quicksort :: Ord a => [a] -> [a]
quicksort [] = []
quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
    where
        lesser  = filter (< p) xs
        greater = filter (>= p) xs
Sometimes, though, the implication is for the worse. Java’s lack of concise, literal syntax for Sets and Maps makes developers choose weird constructs instead – such as the if-construct in validateTypes:
if (
    (!aType.containsKey(PROPNAME_MEDIATYPE      )) ||
    (!aType.containsKey(PROPNAME_PREFERRED      )) ||
    (!aType.containsKey(PROPNAME_CLIPBOARDFORMAT)) ||
    (!aType.containsKey(PROPNAME_DOCUMENTICONID )) ||
    (!aType.containsKey(PROPNAME_URLPATTERN     )) ||
    (!aType.containsKey(PROPNAME_EXTENSIONS     )) ||
    (!aType.containsKey(PROPNAME_UINAME         ))
   )
{
    throw new java.lang.Exception("Type \""+sType
        + "\" does not contain all neccessary properties for a 6.0/6.Y format.");
}
The basic idea is that the type must have all the keys set. Naturally, if the type were a real object and not a Map, the validation check should be on that entity.
If on the other hand the properties test were in a Set, e.g. TYPE_REQUIRED_KEYS, then:
StringBuffer missingKeys = new StringBuffer();
for (String required : TYPE_REQUIRED_KEYS) {
    if (!aType.containsKey(required)) {
        missingKeys.append(required).append(",");
    }
}
if (missingKeys.length() > 0) {
    throw new Exception("Type \"" + sType
        + "\" does not contain all necessary properties for a 6.0/6.Y format: "
        + missingKeys.toString());
}
Would provide the same check, but it would be guarded against extension, and the exception thrown would hold more information.
Furthermore, the code would do one thing and one thing only: check the validity – not check the validity while also keeping track of what constitutes the required keys.
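A sketch of such a required-keys set, here using removeAll to compute the missing keys in one step (the literal key values are placeholders for the real PROPNAME_* constants):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: the required keys live in one unmodifiable Set,
// and validation reports everything that is missing in a single pass.
public class TypeValidator {
    static final Set<String> TYPE_REQUIRED_KEYS = Collections.unmodifiableSet(
        new HashSet<String>(Arrays.asList(
            "MediaType", "Preferred", "ClipboardFormat",
            "DocumentIconID", "URLPattern", "Extensions", "UIName")));

    // Set difference: required keys minus the keys actually present.
    static Set<String> missingKeys(Map<String, Object> aType) {
        Set<String> missing = new HashSet<String>(TYPE_REQUIRED_KEYS);
        missing.removeAll(aType.keySet());
        return missing;
    }
}
```

Adding a new required property then means touching the Set – and nothing else.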
Keeping current
It may seem like a weird requirement to keep current and update code which works just for the sake of being current.
This is a simple amortized development cost. Doing a little work today to keep the code current helps you to not face an impossible leap once the backwards compatibility is dropped. It keeps you going forward, and you will have a current understanding of the codebase. That is, you won’t have any weird blobs of code that only one programmer knows how to use. You alleviate some of the dependencies of people and libraries.
At another time scale this is comparable to having documents in hieroglyphs along with modern writing. If Shakespeare is difficult to understand, then perhaps the language has just moved along.
These are the reasons for keeping current, which means that today’s codebase should be Java 7 compliant, or at least Java 6 compliant. Which again means that enums and generics should be used, as should the enhanced for-loop, which would kill off a lot of the iterators throughout the code.
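As an illustration (class and method names are mine), an enhanced for-loop over entrySet() with generics removes both the Iterator boilerplate and every cast from the walks in toXML and toHTML:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: a generic, enhanced for-loop replacing the raw-Iterator walk.
public class LoopDemo {
    static String render(Map<String, Map<String, Object>> types) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Map<String, Object>> e : types.entrySet()) {
            // No aIt.next(), no (java.lang.String) or (java.util.HashMap) casts.
            sb.append(e.getKey()).append('=').append(e.getValue().size()).append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, Map<String, Object>> types = new HashMap<String, Map<String, Object>>();
        types.put("writer", new HashMap<String, Object>());
        System.out.println(render(types));
    }
}
```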
Not conforming to standard Java code convention for packages
There are lots of ways a Java project can be laid out, but for code coming out of Sun, the least they could have done was follow the natural layout in which each package element matches a folder. This is not the case here, where the com.sun.star.filter.config.tools.utils package maps to the l10ntools/source/filter/utils folder.
For the entire Libre Office polyglot project there are even other weird twists and turns, in which test code is on the source path. I haven’t seen that recently.
Sloppy copy-paste
The code you look at today tells you how you will write the code tomorrow. That is, if you are looking at ugly hacks today, then tomorrow you will make new ugly hacks.
If something looks like copy and paste, then that is the way you do it. If you’re not quite aware of what you are doing, then you’ll be sloppy and miss out on a few needed alterations. E.g.
if((field & FLAGVAL_COMBINED) == FLAGVAL_SUPPORTSSELECTION)
    lFlags.add(FLAGNAME_SUPPORTSSELECTION);
This check will always fail, as FLAGVAL_COMBINED doesn’t share any bits with FLAGVAL_SUPPORTSSELECTION. It should probably have been:
if((field & FLAGVAL_SUPPORTSSELECTION) == FLAGVAL_SUPPORTSSELECTION)
    lFlags.add(FLAGNAME_SUPPORTSSELECTION);
Even so, this is only half of one of the related problems. The code resides in convertFilterFlagValues2Names, but there is a reverse mapping required in convertFilterFlagNames2Values. The problem though is that in that method there is no such check.
One solution would be to have the flags as an enum, and have the two bidirectional mappings work on the entirety of the enum scope. That would allow an addition to the enum to automatically cater for the bidirectional mapping without having to touch up everywhere.
public enum Flag {
    THIRDPARTYFILTER  (0x00080000, "3RDPARTYFILTER"),     // 524288
    ALIEN             (0x00000040, "ALIEN"),              // 64
    ASYNCHRON         (0x00004000, "ASYNCHRON"),          // 16384
    BROWSERPREFERRED  (0x00400000, "BROWSERPREFERRED"),   // 4194304
    CONSULTSERVICE    (0x00040000, "CONSULTSERVICE"),     // 262144
    DEFAULT           (0x00000100, "DEFAULT"),            // 256
    EXPORT            (0x00000002, "EXPORT"),             // 2
    IMPORT            (0x00000001, "IMPORT"),             // 1
    INTERNAL          (0x00000008, "INTERNAL"),           // 8
    NOTINCHOOSER      (0x00002000, "NOTINCHOOSER"),       // 8192
    NOTINFILEDIALOG   (0x00001000, "NOTINFILEDIALOG"),    // 4096
    NOTINSTALLED      (0x00020000, "NOTINSTALLED"),       // 131072
    OWN               (0x00000020, "OWN"),                // 32
    PACKED            (0x00100000, "PACKED"),             // 1048576
    PREFERRED         (0x10000000, "PREFERRED"),          // 268435456
    READONLY          (0x00010000, "READONLY"),           // 65536
    TEMPLATE          (0x00000004, "TEMPLATE"),           // 4
    TEMPLATEPATH      (0x00000010, "TEMPLATEPATH"),       // 16
    USESOPTIONS       (0x00000080, "USESOPTIONS"),        // 128
    COMBINED          (0x00800000, "COMBINED"),           // 8388608
    SUPPORTSSELECTION (0x00000400, "SUPPORTSSELECTION");  // 1024

    private int bitpattern;
    private String name;

    private Flag(int bitpattern, String name) {
        this.bitpattern = bitpattern;
        this.name = name;
    }

    public static int toBitPattern(List<Flag> flags) {
        int val = 0;
        for (Flag f : flags) {
            val |= f.bitpattern;
        }
        return val;
    }

    public static List<Flag> toList(int bitpattern) {
        List<Flag> flags = new ArrayList<Flag>();
        for (Flag flag : Flag.values()) {
            if ((flag.bitpattern & bitpattern) == flag.bitpattern) {
                flags.add(flag);
            }
        }
        return flags;
    }
}
Most likely the name field isn’t needed anymore.
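A small self-contained round-trip check of the bidirectional mapping idea, using a cut-down two-constant version of the enum above (and without the name field):

```java
import java.util.ArrayList;
import java.util.List;

// Cut-down sketch of the Flag enum: two constants are enough to show
// that toList and toBitPattern are inverses over the defined bits.
public class FlagDemo {
    public enum Flag {
        IMPORT(0x00000001),
        EXPORT(0x00000002);

        final int bitpattern;
        Flag(int bitpattern) { this.bitpattern = bitpattern; }

        public static int toBitPattern(List<Flag> flags) {
            int val = 0;
            for (Flag f : flags) val |= f.bitpattern;
            return val;
        }

        public static List<Flag> toList(int bitpattern) {
            List<Flag> flags = new ArrayList<Flag>();
            for (Flag f : values()) {
                if ((f.bitpattern & bitpattern) == f.bitpattern) flags.add(f);
            }
            return flags;
        }
    }
}
```

Adding a new constant to the enum automatically extends both directions of the mapping – no other code needs touching.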
Conclusion
It does not matter whether code is closed source or open source, there is ugly and flawed in both worlds. Most likely there is something beautiful in both as well.
I just cannot accept the poor quality we seem to use as the foundation for almost any profession. There is sloppiness and hacks all around: expensive bugs and clunky solutions, which make everyone's daily digital life more frustrating than it has to be.
You cannot improve on the foundation - the quality of a solution with dependencies is deteriorating like compound interest with a negative rate. If your dependencies accumulate to 90% the best you can hope for is to do 100% yourself, leaving the resulting solution at 0.9*1.0 = 90%. If - on the other hand - you are only delivering 90% quality yourself, the result will be 0.9*0.9 = 81%
Could I do it better? No, most likely not. I'm not claiming to write perfect code. What is frustrating, though, is that I can easily spot the massive pitfalls and the ugliness of solutions.
Given the short amount of time I've spent on this, and the ease with which I found a problematic file, my estimate is that there is too much poor quality code in the world and not enough people who care. Just remember that the code used to handle your flight and luggage, audit your tax returns, ship your parts, handle your bank account, etc. is probably no better.