SmileMiner – A Java library of state-of-the-art machine learning algorithms (github.com/haifengl)
118 points by haifeng on March 2, 2015 | 25 comments


A little tip to the authors: migrate the project from an Eclipse-based one to Maven or Gradle. Not everyone uses Eclipse, and it's friendlier that way.


Maven pom.xml files have been added. Thanks!


Cool, though a more community friendly license would be great ("Confidential Proprietary" doesn't seem encouraging...)


What? From the license file it appears to be Apache https://github.com/haifengl/smile/blob/master/LICENSE


It does say "Confidential Proprietary" at the top of the .java files. Maybe they forgot to update those when they put it on GitHub?


It is Apache 2.0. The library was developed for several years as closed source. It is now Apache 2.0, although the headers in the source files still need to be updated. Don't worry.


I find it somewhat funny that the term "machine intelligence" seems like it was coined in part to distinguish it from "machine learning", but, while this project is clearly part of the latter, they have chosen a moniker using "machine intelligence". Obviously no one owns the word, but it just seems people love using new buzz words and efforts to distinguish these things are quickly muddied.


The project was started five years ago. It is not really about buzz words.


I'd be curious as to how this compares to https://github.com/SmileWide/main, in terms of maturity and ability to run in a distributed mode.


Surprised this was written in Java - I thought Lisp did well in this area and Clojure is right there on the JVM.


It's easier to achieve a high level of performance in raw, idiomatic Java than in Clojure. E.g., it's really important to be able to mutate arrays directly and use primitives with no boxing. You can do that kind of thing in Clojure, but it just ends up looking like Java with parens instead of curly brackets and more explicit casting.

The proper use for Clojure is to express higher-order application logic more concisely, not to implement high performance math algos.
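To make the point about in-place mutation and unboxed primitives concrete, here is a minimal Java sketch. The class and method names are hypothetical and not taken from the library; it only contrasts a `double[]` with a boxed `List<Double>`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; none of these names come from the library.
class PrimitiveVsBoxed {

    // Mutates the array in place: no allocation and no boxing.
    static void scaleInPlace(double[] xs, double factor) {
        for (int i = 0; i < xs.length; i++) {
            xs[i] *= factor;
        }
    }

    // Sums a primitive array; every element is a raw double.
    static double sumPrimitive(double[] xs) {
        double total = 0.0;
        for (double x : xs) {
            total += x;
        }
        return total;
    }

    // Sums a boxed list; every element is a Double object
    // that is unboxed on each read.
    static double sumBoxed(List<Double> xs) {
        double total = 0.0;
        for (Double x : xs) {
            total += x;
        }
        return total;
    }

    public static void main(String[] args) {
        double[] xs = {1.0, 2.0, 3.0};
        scaleInPlace(xs, 2.0);

        List<Double> boxed = new ArrayList<>();
        for (double x : xs) {
            boxed.add(x); // autoboxing: double -> Double
        }

        System.out.println(sumPrimitive(xs));
        System.out.println(sumBoxed(boxed));
    }
}
```

Both sums give the same result; the difference is in memory layout and per-element overhead, which is what tends to matter in numeric inner loops.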


Well, there are "high-performance things" I like more in Clojure, for example:

- areduce / amap

- Uncleaned data is often of Object type (due to missing or invalid values), and one can apply the complete set of core.reducers functions to it without intermediate representations. In Java this would be horrible.

Still, you're right in the sense that Clojure execution isn't faster than Java's; only its development is.


Transients (mutable Clojure collection types) are also high performance.


Mahout and especially Weka are popular in the field, and both are written in Java.


I agree with fiatmoney, and would also like to add that if you implement it in simple Java with few dependencies, it becomes easier to use the library in many contexts, e.g., from other JVM languages.


Yes, we don't want it depending on third parties.


Cool stuff! How does it differ from other Java ML libraries (like Weka)?


Technical merits aside, Weka is GPL-licensed (commercial licenses available), and this library is Apache 2.0.


This is amazing, thank you for sharing!


Rescale the image thumbnails in the GitHub README. It takes forever to load and costs a fortune on mobile internet.


Definitely should resize the thumbnails, instead of using, for example, a 2640x1854 image displayed at 345x242.
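For a rough sense of the overhead, a small Java sketch of the arithmetic follows. The class and method names are hypothetical, and the ratio is of pixel counts only; the actual byte savings depend on the image format and compression.

```java
import java.util.Locale;

// Hypothetical helper for the arithmetic above; not part of any library.
class ThumbnailOverhead {

    // Ratio of source pixels to displayed pixels.
    static double pixelRatio(int srcW, int srcH, int dstW, int dstH) {
        return ((double) srcW * srcH) / ((double) dstW * dstH);
    }

    public static void main(String[] args) {
        double ratio = pixelRatio(2640, 1854, 345, 242);
        // 4,894,560 source pixels vs 83,490 displayed pixels
        System.out.println(String.format(Locale.ROOT, "%.1fx more pixels than displayed", ratio));
    }
}
```

That works out to roughly 59 times more pixels downloaded than are actually shown.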


Also, an unprivileged version would be nice: Java Web Start or an applet. People still use those.


Awesome, awesome, awesome


Would be great if there was a similar library for Python.


You mean something like scikit-learn?

http://scikit-learn.org/stable/



