pbharrin / machinelearninginaction3x
Source Code for Machine Learning in Action for Python 3.X
I'm getting an error with the classify0() function in the kNN.py module.
      9     for i in range(k):
     10         voteIlabel = labels[sortedDistIndicies[i]]
---> 11         classCount[voteIlabel] = classCount.get(voteIlabel, 0) + 1
     12     sortedClassCount = sorted(classCount.items(), key=operator.itemgetter(1), reverse=True)
     13     return sortedClassCount[0][0]
TypeError: unhashable type: 'list'
I've taken the code from the pbharrin/machinelearninginaction3x github repo so it shouldn't be down to typos. I've run the code line by line outside of the function and I get the same error at the same point when I try to run the for loop. I'm not an experienced Python coder so it's a bit beyond me to try and solve this myself. I'm loving the book so far though.
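The traceback indicates that voteIlabel is itself a list, and a list cannot be used as a dictionary key. Below is a minimal sketch of one way this can happen, assuming the labels argument ended up as a list of lists. The variable names are illustrative and are not taken from the repo.

```python
# Hypothetical labels: each label is wrapped in its own list (the suspected cause).
nested_labels = [['A'], ['A'], ['B'], ['B']]

class_count = {}
error_message = None
try:
    vote = nested_labels[0]  # vote is ['A'], a list, so it cannot be hashed
    class_count[vote] = class_count.get(vote, 0) + 1
except TypeError as err:
    error_message = str(err)

# Flattening the labels so that each one is a plain string avoids the error.
flat_labels = [label[0] for label in nested_labels]
for vote in flat_labels:
    class_count[vote] = class_count.get(vote, 0) + 1
```

If this matches your situation, printing type(labels[0]) just before calling classify0() should show list rather than str.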
def createInitSet(dataSet):
    retDict = {}
    for trans in dataSet:
        retDict[frozenset(trans)] = 1
    return retDict
The line retDict[frozenset(trans)] = 1 does not account for two identical transactions: a duplicate overwrites the count instead of incrementing it.
It should be changed to:
def createInitSet(dataSet):
    retDict = {}
    for trans in dataSet:
        retDict[frozenset(trans)] = retDict.get(frozenset(trans), 0) + 1
    return retDict
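A quick check of the counting version on a small, made-up list of transactions (the data and the snake_case names are illustrative only):

```python
def create_init_set(data_set):
    """Count how often each distinct transaction occurs."""
    counts = {}
    for trans in data_set:
        key = frozenset(trans)  # item order inside a transaction is ignored
        counts[key] = counts.get(key, 0) + 1
    return counts

# ['a', 'b'] and ['b', 'a'] are the same transaction once converted to a frozenset.
transactions = [['a', 'b'], ['b', 'a'], ['c']]
counts = create_init_set(transactions)
```

With the original assignment (= 1), the duplicated transaction would be recorded with a count of 1 instead of 2.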
def aprioriGen(Lk, k): #creates Ck
    retList = []
    lenLk = len(Lk)
    for i in range(lenLk):
        for j in range(i+1, lenLk):
            L1 = list(Lk[i])[:k-2]; L2 = list(Lk[j])[:k-2]
            L1.sort(); L2.sort()
            if L1==L2: #if first k-2 elements are equal
                retList.append(Lk[i] | Lk[j]) #set union
    return retList
I think L1 and L2 should be sorted first and only then sliced to their first k-2 items for the comparison. Take this example:
L1 = [1,2]
L2 = {3, 1}
Admittedly, when a set of small integers is converted to a list it usually comes out sorted, but that order is not guaranteed.
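A minimal sketch of the sort-then-slice variant proposed above. It is a rewrite for illustration, not the repo's code, and the input sets are made up:

```python
def apriori_gen(Lk, k):
    """Join (k-1)-item sets whose first k-2 items, in sorted order, are equal."""
    candidates = []
    for i in range(len(Lk)):
        for j in range(i + 1, len(Lk)):
            # Sort first, then slice, so the prefix does not depend on set iteration order.
            prefix_i = sorted(Lk[i])[:k - 2]
            prefix_j = sorted(Lk[j])[:k - 2]
            if prefix_i == prefix_j:
                candidates.append(Lk[i] | Lk[j])  # set union
    return candidates

L2 = [frozenset({0, 1}), frozenset({0, 2}), frozenset({1, 2})]
C3 = apriori_gen(L2, 3)
```

Only the pair {0, 1} and {0, 2} shares the sorted prefix [0], so the candidate {0, 1, 2} is produced exactly once.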
Steps to reproduce:
Traceback (most recent call last):
  File "/usr/local/src/github/machinelearninginaction3x/Ch02/kNNTest.py", line 10, in <module>
    from Ch02 import kNN
ModuleNotFoundError: No module named 'Ch02'
When I changed "from Ch02 import kNN" to "import kNN", it worked.
It seems that when you run the script from inside the Ch02 directory, you have to import the module directly.
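An alternative that keeps the "from Ch02 import kNN" form is to make the repository root importable. The sketch below assumes the script lives one level below the root, as kNNTest.py does in the traceback; add_repo_root is a hypothetical helper, not part of the repo.

```python
import sys
from pathlib import Path

def add_repo_root(script_path):
    """Put the directory two levels above script_path on sys.path and return it."""
    repo_root = str(Path(script_path).resolve().parent.parent)
    if repo_root not in sys.path:
        sys.path.insert(0, repo_root)
    return repo_root

# Path taken from the traceback above; inside the script itself you would pass __file__.
repo_root = add_repo_root('/usr/local/src/github/machinelearninginaction3x/Ch02/kNNTest.py')
```

Calling add_repo_root(__file__) at the top of kNNTest.py, before the import line, should let Python find Ch02 regardless of the current working directory.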
kMeans.py in Ch10 does not work in Python 3.
def randCent(dataSet, k):
    n = shape(dataSet)[1]
    centroids = mat(zeros((k, n)))
    for j in range(n):
        minJ = min(dataSet[:,j])
        rangeJ = float(max(dataSet[:,j]) - minJ)
I got an error:
rangeJ = float(max(dataSet[:,j]) - minJ)
TypeError: unsupported operand type(s) for -: 'map' and 'map'
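The message indicates that the entries of dataSet are map objects rather than numbers. A plausible cause, assuming the file loader converts each line with map(float, ...), is that map returns a lazy iterator in Python 3 instead of a list. A minimal sketch of the usual fix follows; parse_line is a hypothetical stand-in for that loader step.

```python
def parse_line(line):
    """Split a tab-separated line and return its fields as a list of floats."""
    return list(map(float, line.strip().split('\t')))  # list(...) materializes the values

row_a = parse_line('1.5\t2.0\n')
row_b = parse_line('0.5\t1.0\n')
diff = row_a[0] - row_b[0]  # plain floats subtract as expected

# For contrast: bare map objects do not support subtraction in Python 3.
try:
    map(float, ['1.5']) - map(float, ['0.5'])
    map_subtraction_failed = False
except TypeError:
    map_subtraction_failed = True
```

Wrapping the map call in list() inside the loader should make the rows numeric again, so min, max, and the subtraction in randCent operate on numbers.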
def generateRules(L, supportData, minConf=0.7): #supportData is a dict coming from scanD
    bigRuleList = []
    for i in range(1, len(L)): #only get the sets with two or more items
        for freqSet in L[i]:
            H1 = [frozenset([item]) for item in freqSet]
            if (i > 1):
                rulesFromConseq(freqSet, H1, supportData, bigRuleList, minConf)
            else:
                calcConf(freqSet, H1, supportData, bigRuleList, minConf)
    return bigRuleList
When the frequent set is something like {1,2,3}, it can't generate rules like {1,2}->3.
Should the snippet
            if (i > 1):
                rulesFromConseq(freqSet, H1, supportData, bigRuleList, minConf)
            else:
                calcConf(freqSet, H1, supportData, bigRuleList, minConf)
be changed to the following?
            rulesFromConseq(freqSet, H1, supportData, bigRuleList, minConf)
            if (i > 1):
                calcConf(freqSet, H1, supportData, bigRuleList, minConf)
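A self-contained sketch of one way to also cover single-item consequents. It assumes that rulesFromConseq only tests consequents one item larger than those it is given; that helper is not shown in this thread, so the versions below are simplified stand-ins with hypothetical names, and the support values are made up. Under that assumption, calling only rulesFromConseq on a two-item set would yield no rules, so this sketch calls calc_conf unconditionally and adds rules_from_conseq for larger sets.

```python
def calc_conf(freq_set, H, support, rules, min_conf=0.7):
    """Keep the consequents in H whose rule meets min_conf; record the rules."""
    kept = []
    for conseq in H:
        conf = support[freq_set] / support[freq_set - conseq]
        if conf >= min_conf:
            rules.append((freq_set - conseq, conseq, conf))
            kept.append(conseq)
    return kept

def grow(H):
    """Merge consequents of size m into distinct candidates of size m + 1."""
    return list({a | b for a in H for b in H if len(a | b) == len(a) + 1})

def rules_from_conseq(freq_set, H, support, rules, min_conf=0.7):
    """Stand-in for the assumed behaviour: test only consequents larger than H's."""
    m = len(H[0])
    if len(freq_set) > m + 1:
        bigger = calc_conf(freq_set, grow(H), support, rules, min_conf)
        if len(bigger) > 1:
            rules_from_conseq(freq_set, bigger, support, rules, min_conf)

def generate_rules(L, support, min_conf=0.7):
    rules = []
    for i in range(1, len(L)):
        for freq_set in L[i]:
            H1 = [frozenset([item]) for item in freq_set]
            calc_conf(freq_set, H1, support, rules, min_conf)  # single-item consequents
            if i > 1:
                rules_from_conseq(freq_set, H1, support, rules, min_conf)
    return rules

# Made-up supports for the transactions {1,2,3}, {1,2,3}, {1}, {2}, {3}.
fs = frozenset
support = {fs({1}): 0.6, fs({2}): 0.6, fs({3}): 0.6,
           fs({1, 2}): 0.4, fs({1, 3}): 0.4, fs({2, 3}): 0.4,
           fs({1, 2, 3}): 0.4}
L = [[fs({1}), fs({2}), fs({3})],
     [fs({1, 2}), fs({1, 3}), fs({2, 3})],
     [fs({1, 2, 3})]]
rules = generate_rules(L, support)
```

On this data the only rules that reach the 0.7 threshold are the three with a single-item consequent, such as {1,2}->{3}, which the if/else version never tests under the stated assumption.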
I am reading the book "Machine Learning in Action" and reproduced the code from the Chapter 2 kNN section, but I got the error
"ValueError: could not convert string to float:" on this line:
machinelearninginaction3x/Ch02/kNN.py
Line 48 in c1ea22f
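Line 99:
del(list(trainingSet)[randIndex])
It seems that this deletion has no effect, because list(trainingSet) creates a temporary copy and the item is removed from that copy only.
Maybe change it to:
del(trainingSet[randIndex])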
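A small sketch of why the original line is a no-op and what the direct deletion needs in order to work. It assumes trainingSet holds the indices 0 to 49, which the thread does not show. If it is created as a range object, del on it raises a TypeError in Python 3, so it has to be converted to a list once, up front.

```python
import random

training_set = list(range(50))  # a real list, so items can be deleted in place
test_set = []

# The pattern from the report: deleting from a fresh copy leaves the original intact.
copy_only = list(training_set)
del copy_only[0]
length_after_copy_delete = len(training_set)

# Deleting from the list itself moves ten random indices into the test set.
for _ in range(10):
    rand_index = random.randrange(len(training_set))
    test_set.append(training_set[rand_index])
    del training_set[rand_index]
```

After the loop the two lists are disjoint, which is the point of the hold-out split.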
When we attempt to classify a document, we multiply a lot of probabilities together to
get the probability that a document belongs to a given class. This will look something
like p(w0|1)p(w1|1)p(w2|1). If any of these numbers are 0, then when we multiply
them together we get 0. To lessen the impact of this, we’ll initialize all of our occurrence counts to 1, and we’ll initialize the denominators to 2.
First of all, huge thanks for creating an understandable ML book.
Coming back to the question now:
I perfectly understood the idea behind setting p0Num and p1Num to np.ones, but why have you set p0Denom and p1Denom both to 2?
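One common reading, which the author does not confirm in this thread, is add-one (Laplace) smoothing: add 1 to every count and add the number of possible outcomes to the denominator, so that the smoothed probabilities still sum to 1. For a feature with two outcomes (word present or absent) that number is 2. A minimal sketch under that assumption, with a hypothetical function name and made-up counts:

```python
def smoothed_probability(count, total, n_outcomes=2):
    """Add-one smoothing: (count + 1) / (total + number of possible outcomes)."""
    return (count + 1) / (total + n_outcomes)

# A word never seen in 10 documents of a class no longer gets probability 0.
p_unseen = smoothed_probability(0, 10)

# The two outcomes (present in 3 documents, absent in 7) still sum to 1.
p_present = smoothed_probability(3, 10)
p_absent = smoothed_probability(7, 10)
```

Whether 2 is the right constant for the book's denominators depends on what they count, so treat this as an interpretation rather than the book's stated reasoning.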