Two-step Cluster Analysis in SPSS

This is a two-step cluster analysis using SPSS. I do this to demonstrate how to explore profiles of responses. These profiles can then be used as a moderator in SEM analyses. The data is available on the homepage of StatWiki.

Comments: 225

  • @Gaskination
    @Gaskination 3 years ago

    Here's a fun pet project I've been working on: udreamed.com/. It is a dream analytics app. Here is the KZread channel where we post a new video almost three times per week: kzread.info/dron/iujxblFduQz8V4xHjMzyzQ.html Also available on iOS: apps.apple.com/us/app/udreamed/id1054428074 And Android: play.google.com/store/apps/details?id=com.unconsciouscognitioninc.unconsciouscognition&hl=en Check it out! Thanks!

  • @duran099
    @duran099 10 years ago

    Thank you! I enjoyed the back and forth of your problem shooting at the start for which variables to use. Made it more real, and gave some context from a theory perspective.

  • @blackchallice
    @blackchallice 11 years ago

    WOW, you have explained cluster analysis very clearly. This is the first time I'm learning CA and I totally get it. Thank you!

  • @krismatthews7550
    @krismatthews7550 8 years ago

    You seriously just saved my Quantitative Analysis project :] THANK YOU!

  • @yuriveneziani8029
    @yuriveneziani8029 6 years ago

    Amazing explanation... clear and direct! Thank you!

  • @vide0gameCaster
    @vide0gameCaster 8 years ago

    Dude you don't understand how this vid helped me for my statistics exam. I aced my test thanks to you! You just gained a subscriber!

  • @mxm001
    @mxm001 9 years ago

    Thank you SO much, James. This was very helpful.

  • @TulioMaia
    @TulioMaia 12 years ago

    Thank you so much! I'm a starter on SPSS. I'm an R user, but I'm gonna start SPSS from now! Thanks again!

  • @MarcRodrigues10
    @MarcRodrigues10 5 years ago

    Thank you! This video helped me a lot, especially with the results analysis.

  • @samirsarsamss
    @samirsarsamss 10 years ago

    Many thanks dear James Gaskin for this helpful video, please go ahead with other different aspects or even tools.

  • @tekonen
    @tekonen 9 years ago

    Thanks for sharing your knowledge!

  • @lefkiospaik
    @lefkiospaik 10 years ago

    Great presentation! Moreover the "not suitable" variables you chose in the beginning, really helped a lot to understand more on the cluster analysis. Thanks

  • @talhelmt
    @talhelmt 11 years ago

    Thanks! I appreciate the time you put into making this.

  • @koenovisch
    @koenovisch 11 years ago

    Thank you for your reaction! I will continue looking for it!

  • @chrisnahm
    @chrisnahm 7 years ago

    Really enjoyed this and was very helpful. Thank you!

  • @user-qy6mx4oq9i
    @user-qy6mx4oq9i 4 years ago

    thank you for awesome explanation! wish you good luck! I've found all your videos very very very helpful

  • @ildilovasz2982
    @ildilovasz2982 4 years ago

    Thank you for this video, very clear and it helped me write my thesis.

  • @Xirukah
    @Xirukah 11 years ago

    You're a great guy!! I study SPSS in college at three levels: Introduction to Data Analysis, Univariate Data Analysis, and Multivariate Data Analysis for the 3rd level. At the moment I'm on the 3rd, and this process is really useful! Thank you!!

  • @AlbertGavino
    @AlbertGavino 9 years ago

    Great, simple video on 2-step clustering (great for categorical or binary variables) with some continuous variables. But I like 2-step since it creates its own clusters, which I don't have to specify (unlike in K-means).

  • @vshapoval
    @vshapoval 10 years ago

    I do not have questions, but I found your video extremely helpful, with a very good explanation, so I only wanted to say thank you. Your video was a great help. =)

  • @DaDonnyZhang
    @DaDonnyZhang 10 years ago

    Great video! Thank you so much!

  • @petradubajovamarinakova9268
    @petradubajovamarinakova9268 10 years ago

    Your video helped me. Thank you very much :)

  • @snailbby6664
    @snailbby6664 7 years ago

    "These are the ones you'll probably punish by making them managers" 😂

  • @Jemoeder86
    @Jemoeder86 10 years ago

    Very informative! Thanks

  • @sticky924
    @sticky924 10 years ago

    Thank you for this video, it is very helpful

  • @Gaskination
    @Gaskination 11 years ago

    Not a stupid question because I had to look up the answer :) The SPSS help manual says that the two-step cluster analysis assumes normally distributed data for all continuous variables, but that tests have shown it to be robust enough to handle non-normal data fairly well.

  • @ynbearljx
    @ynbearljx 7 years ago

    Great video, thank you!

  • @zhexiongtao2167
    @zhexiongtao2167 10 years ago

    really interesting and helpful! Hope you can also make one for K-Means

  • @Gaskination
    @Gaskination 11 years ago

    Funny you should ask! I was just considering doing this yesterday. I will probably do a K-means cluster, and also show how to segment the data and explore clusters for sub-populations. This is definitely on my to do list.

  • @yifanli4312
    @yifanli4312 5 years ago

    Thank you! This video is very helpful!

  • @juliaworldwide
    @juliaworldwide 8 years ago

    Thank you very much for that !

  • @Gaskination
    @Gaskination 10 years ago

    You can certainly try k-means. It just depends on what your research intentions are. I actually prefer k-means over two-step. I just learned two-step first, so that's what I made the video for. I should probably make one for k-means sometime...

  • @wassdepp1
    @wassdepp1 7 years ago

    Thank you, It made my day

  • @JohnParavantis
    @JohnParavantis 4 years ago

    If I may, at 9:01 I would like to correct your reference to the boxplot: the middle line does indeed represent the median, but the left and right edges of the box lie at the first and third quartiles respectively. So, rather than representing one standard deviation below and above the mean, the box represents the middle 50% of the observations. Thank you very much for the video, a very lucid explanation of swamping variables, still very useful in 2019!

  • @Gaskination

    @Gaskination

    4 years ago

    Thanks!
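
The boxplot correction above can be checked with a short Python sketch. It uses only the standard library; the sample values and the helper names `box_summary` and `mean_sd_band` are hypothetical and not taken from the video's dataset. The "inclusive" quantile method is one common convention, and SPSS may use a slightly different one.

```python
import statistics

def box_summary(values):
    """Return (Q1, median, Q3), the three lines that make up the box."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q1, median, q3

def mean_sd_band(values):
    """Return (mean - SD, mean + SD), which is NOT what the box edges show."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return mean - sd, mean + sd

# Hypothetical skewed sample (illustration only).
sample = [1, 2, 2, 3, 3, 4, 5, 8, 20]
print("box edges and median:", box_summary(sample))
print("mean +/- 1 SD:", mean_sd_band(sample))
```

On skewed data like this, the two intervals differ noticeably, which is why reading the box as "mean plus or minus one standard deviation" can mislead.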

  • @SharonaTLevy-nl4dc
    @SharonaTLevy-nl4dc 8 years ago

    thank you, very helpful

  • @ntaalya
    @ntaalya 10 years ago

    Thank You very much!

  • @emindeger.
    @emindeger. 4 years ago

    Hi thank you very much for this video series. I have a question, I would appreciate it if you answer. Do we need to normalize the data in spss?

  • @olofreichenberg6885
    @olofreichenberg6885 11 years ago

    Very helpful!

  • @medosman23
    @medosman23 9 years ago

    great video thank you

  • @Gaskination
    @Gaskination 11 years ago

    Thanks for the ideas. I just do these when the need arises or when I have the time. I'll probably have some time to do a couple next week. I have some data that has grouping variables, so no need to send me yours. Thank you though.

  • @researcher53
    @researcher53 11 years ago

    Thanks, very helpful

  • @TheCopginger
    @TheCopginger 11 years ago

    That's great indeed! Well, I also have some ideas on how you could make it better from a learner's point of view: 1. Explaining why to use a specific methodology for clustering. 2. Progressing from basic to advanced methodology. 3. Probably using data across industries/sectors. I don't know how much time you have to spend on these, or whether you would want to; however, I can provide you with data that will enhance the quality of your analysis (and of course your self-marketing value).

  • @polisherci
    @polisherci 7 years ago

    Hey, can you run a regression clustered by a certain variable in SPSS? Like the regress ... cluster (.. ) command in Stata?

  • @Gaskination

    @Gaskination

    7 years ago

    I'm not sure. I haven't used STATA much. You can run a cluster analysis, and then use those clusters as grouping variables when running regressions.

  • @hem135
    @hem135 5 years ago

    Hi James - This video is very helpful, thank you! Within the model viewer, I can see the average silhouette statistic for the cluster result. My understanding is that this number is the average fit across the items in the cluster. Is there a way to find the silhouette data for each item separately? For context, I'm using cluster analysis to identify exemplar scenarios for different types of behavior. I'm clustering scenarios based on participant ratings (e.g., this scenario represents X behavior, yes/no). I'd like to compare fit across a few different types of participant groups using an ANOVA of the silhouettes for each item. Thanks in advance!

  • @Gaskination

    @Gaskination

    5 years ago

    If there is a way, I'm not sure how to do it.
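
Outside SPSS, per-item silhouette values can be computed from the cluster membership column and the clustering variables. The sketch below is a minimal pure-Python version of the textbook silhouette definition with Euclidean distance. It assumes at least two clusters and numeric inputs; the function name `silhouette_per_item` and the toy data are hypothetical, and the values may not match the summary silhouette that SPSS reports, which may be computed differently.

```python
import math

def silhouette_per_item(points, labels):
    """Textbook silhouette value for every row, using Euclidean distance.

    points: list of numeric tuples (one per row); labels: cluster id per row.
    Assumes at least two distinct cluster ids.
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)

    def mean_dist(i, members):
        # Mean distance from row i to the given rows, excluding i itself.
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    scores = []
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = mean_dist(i, own)  # cohesion: distance to own cluster
        b = min(mean_dist(i, members)  # separation: nearest other cluster
                for other, members in clusters.items() if other != label)
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return scores

# Hypothetical toy data: two tight, well-separated groups.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
labels = [1, 1, 2, 2]
print(silhouette_per_item(points, labels))
```

The resulting list has one value per row, so it could be saved as a new column and used as the dependent variable in a follow-up ANOVA.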

  • @Gaskination
    @Gaskination 11 years ago

    Look at the sig value. If it is less than 0.05, then the groups are significantly different on that comparison variable. If the cluster quality is poor, then you might try a three factor model. I'm not sure you can rely on the cluster groups when they are poor. Poor quality means that the membership assignment was inconsistent based on the indicators used for the clustering, e.g., sometimes males went into cluster 1, sometimes into cluster 2.

  • @Gaskination
    @Gaskination 11 years ago

    Glad to be helpful. Hope you'll subscribe and tell your friends. :)

  • @brandonknettel545
    @brandonknettel545 10 years ago

    Hi there, thanks for the informative video. I ran this analysis for my data in two different ways and each time I got a single-cluster solution. I'm assuming that is an indication that my participants are homogeneous on the variables being studied, but when I run ANOVAs I am getting significant group differences. Is my best bet to run a k-group cluster analysis and force a distinction?

  • @MrMustav
    @MrMustav 11 years ago

    Good tutorial

  • @gs19921
    @gs19921 8 years ago

    Thank you for this video. I have done 4 different k-means clusterings and I need a method that chooses the best cluster analysis. Can I do it with two-step or another method?

  • @Gaskination

    @Gaskination

    8 years ago

    +gs19921 Two step will provide a "fit" measure to let you know if the clustering solution was good. You can also examine the AIC (try to minimize it).

  • @sugun1993
    @sugun1993 5 years ago

    Thank you for the quick tutorial. I am performing two-step clustering on data from a recent study, but I want to somehow fit this new data into the clusters generated from past data. Kind of like supervised learning, but unfortunately neither the coefficients of the model of the past data nor the data itself are available. Is there a way to solve this or is this case hopeless? P.S. To get the project done in time, without access to any tools, I tried to put the new records into clusters manually, respecting the features/characteristics of the previously generated clusters. Since time is my major constraint and the data is just 40 new entries, I have already done it (could you give me some idea about my options to justify the job done this way?). But I am just curious to know the right way.

  • @Gaskination

    @Gaskination

    5 years ago

    If the new data is using the exact same variables as the original data, then you can simply add the new rows to the dataset and re-run the cluster analysis. That is the easiest way. If the new data is not using the same variables, then there is no statistical way to cluster them along the same lines.

  • @thomasbulitta3817
    @thomasbulitta3817 8 years ago

    Hi James, thank you for that video. It was very helpful. Do you know what actually happens "inside" SPSS when you run this "Two-Step-Cluster"? Which forms of clustering are used? Single linkage and hierarchical cluster analysis?

  • @Gaskination

    @Gaskination

    8 years ago

    +Thomas Bulitta It performs a hierarchical and a non-hierarchical step. I'm not sure which specific algorithms, but I bet the SPSS manual says.

  • @JustMe-pt7bc
    @JustMe-pt7bc 11 years ago

    good inspiration for something new'!!!

  • @mcole6234
    @mcole6234 11 years ago

    James, very informative. You mention the need for over 30 in the smallest cluster and between 2-3 for the largest:smallest ratio. I am doing a PhD and wondered where these numbers came from. Do you have an academic reference(s) I could cite? Also, at the end of the video you ran an ANOVA from the newly formed variables in SPSS. I ran different analyses and never had more than 4 clusters, but there were 5 new variables, all with uninformative names. How do I know which ones to use?

  • @xiaoyanggong2006
    @xiaoyanggong2006 7 years ago

    Thanks!

  • @tomh3675
    @tomh3675 10 years ago

    Thanks for the video, do you have an example of doing a cluster analysis as a way of illustrating factor analysis/factor scores?

  • @Gaskination

    @Gaskination

    10 years ago

    No, but I do have several videos about how to do factor analysis and extract factor scores.

  • @roxy629
    @roxy629 9 years ago

    Awesome! So clear and informational :) James, what would be the major differences between cluster analysis and factor analysis? Is it the profiling aspect? Can CA do things that FA cannot? Thanks again!

  • @Gaskination

    @Gaskination

    9 years ago

    roxy629 Cluster analysis clusters rows. Factor analysis "clusters" columns.

  • @roxy629

    @roxy629

    9 years ago

    James Gaskin ahhh!!! that's why it's called "profiling" makes so much sense thanks james :)

  • @Gaskination
    @Gaskination 11 years ago

    That is what I meant, but those are undesirable sample sizes. You might also look at indicator importance to see if one variable is swamping out the others. If so, you might consider removing it. Or you can try K-means clustering... I haven't made any video for that yet...

  • @MrMustav
    @MrMustav 11 years ago

    Great!

  • @MrNicks86
    @MrNicks86 10 years ago

    Thanks for the great video - very useful! I was just wondering if you could explain (in a nutshell) the difference between this Two-Step cluster analysis and k-means? Thanks

  • @Gaskination

    @Gaskination

    10 years ago

    The main difference is that two-step allows you to distinguish between categorical and continuous variables, and it processes them differently. Whereas k-means just treats them all the same. So, if you have categorical variables, two-step would be a more accurate clustering.

  • @MrNicks86

    @MrNicks86

    10 years ago

    Thanks for your reply. So with continuous data like domestic energy use, would k means be more appropriate? And is it right to say that k means treats each variable as independent to the next, which in the case of domestic energy use is not quite the case? Many thanks again!

  • @Gaskination

    @Gaskination

    10 years ago

    Nicholas Samson Unfortunately, I'm not an expert in cluster analyses. So your question surpasses my immediate knowledge. I would just have to look it up. I know that there are some good documents and articles that discuss the differences between two-step and k-means. I just googled it. Best of luck to you.

  • @MrNicks86

    @MrNicks86

    10 years ago

    Thanks James!

  • @alfonspriessner8556
    @alfonspriessner8556 8 years ago

    Hi James! Very helpful video - you saved me a lot of time. :-) Unfortunately, I have two additional questions, and it would be great if you could help me. I am sure, you are the expert who can help me! 1) Lets assume SPSS program proposes 3 clusters based on a set of variables. What statistical tests are used for the selection of 3 clusters instead of 2 or 4 in the background? I read in some papers that e.g., likelihood-ratio (L2) and its p-value, the Bayesian Information Criterion (BIC) and the number of parameters (Npar) could be examples for these statistical tests (there are for sure others)? And if some of these tests are conducted by SPSS in the background, is there a way how I can create an output-chart of these statistical parameters in SPSS? In other words, since SPSS tells me 3 clusters, I would like to show why 3 clusters and not 4 based on a few statistical tests. 2) Lets assume we still have these 3 clusters from question 1 which were created based on a set of variables. But I have another variable (e.g., age) which I did not use for the cluster analysis. How (if there is any option in SPSS) can I calculate the mean of variable age for each of the 3 identified cluster and show it in an output table (best case for more than 1 additional variable). I hope you understand my questions. I would appreciate your help and guidance!! Thanks a lot in advance! Regards, Alfons

  • @Gaskination

    @Gaskination

    8 years ago

    1. SPSS lets you choose the AIC or the BIC as the clustering criterion, or you can use the silhouette measure that shows in the output. The silhouette is considered fairly robust. You can force it to 2 or 4 clusters as well to see what the silhouette score is for those. 2. Watch this video at the 2:16 mark. It will show how to do this using the Output button.

  • @Thanh-ThaoTPham
    @Thanh-ThaoTPham 7 years ago

    Hi James, thanks for your valuable sharing. However, is there any source for the acceptable size of smallest cluster and threshold of ratio of sizes? Thanks in advance.

  • @Gaskination

    @Gaskination

    7 years ago

    I'm not sure. I'm really not an expert on cluster analysis. Those numbers just "feel" right, which I realize is not very scientific of me. I guess they feel right because they are practically useful - i.e., clusters of those sizes are usable in subsequent analyses and cluster ratios of that proportion break the data up into roughly equivalent groups.

  • @Thanh-ThaoTPham

    @Thanh-ThaoTPham

    7 years ago

    Thanks so much for your reply. Anw, I really love your tutorial series ^^
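
The rule of thumb discussed in this thread (more than 30 cases in the smallest cluster, and a largest-to-smallest ratio of no more than about 2 to 3) can be checked on a saved cluster membership column with a few lines of Python. This is a sketch only: the thresholds are the informal ones mentioned above, not a published standard, and the function name `cluster_size_check` and the sample labels are hypothetical.

```python
from collections import Counter

def cluster_size_check(labels, min_size=30, max_ratio=3.0):
    """Summarise cluster sizes and apply the informal rule of thumb."""
    sizes = Counter(labels)
    smallest = min(sizes.values())
    largest = max(sizes.values())
    ratio = largest / smallest
    return {
        "sizes": dict(sizes),
        "ratio": ratio,
        # Both conditions must hold: smallest cluster over min_size,
        # and largest:smallest ratio at or below max_ratio.
        "ok": smallest > min_size and ratio <= max_ratio,
    }

# Hypothetical membership column: 40 cases in cluster 1, 90 in cluster 2.
labels = [1] * 40 + [2] * 90
print(cluster_size_check(labels))
```

Because the thresholds are arguments, a stricter or looser rule can be tried without changing the function.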

  • @user-ty5ie6nd7n
    @user-ty5ie6nd7n 5 years ago

    thank you~

  • @nassimfard867
    @nassimfard867 9 years ago

    Thanks for the videos. Can you please tell me if a set of data can be clustered by only one variable? And if yes, is two-step clustering or k-means clustering more appropriate? I want to categorize a set of data based on one variable into three groups, and I don't know how to define the cut-off or range for each category. I would be glad if you can help me.

  • @Gaskination

    @Gaskination

    9 years ago

    Nassim Fard If it is just one variable, then clustering algorithms won't help. If the variable is categorical, then just group them based on the category values. For example, if the variable is religion, then group them by which religion they affiliate with. If the variable is continuous or ordinal, then make logical cutoff points into low, med, high.
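
One way to pick low/med/high cutoff points for a single continuous variable is to split at the terciles, so each group holds roughly a third of the cases. The sketch below assumes that choice; theory-driven cutoffs are equally valid. The function name `low_med_high` and the sample values are hypothetical.

```python
import statistics

def low_med_high(values):
    """Label each value low/med/high using tercile cutoffs.

    Returns (labels, (low_cut, high_cut)).
    """
    low_cut, high_cut = statistics.quantiles(values, n=3, method="inclusive")

    def label(v):
        if v <= low_cut:
            return "low"
        if v <= high_cut:
            return "med"
        return "high"

    return [label(v) for v in values], (low_cut, high_cut)

# Hypothetical scores (illustration only).
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]
groups, cuts = low_med_high(scores)
print(cuts)
print(groups)
```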

  • @123canuckfan
    @123canuckfan 11 years ago

    God I wish you were my stats teacher!

  • @rajeshpandit3634
    @rajeshpandit3634 8 years ago

    Great video. I just want to check whether the variables you put both continuous and categorical, do you standardize them? Standardize I mean Z Normal variables as you are putting scale, binary, categorical variables together

  • @Gaskination

    @Gaskination

    8 years ago

    +Rajesh Pandit SPSS automatically standardizes all continuous variables when doing a 2-step cluster analysis. You can see this in the options area when doing the 2-step.
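
For readers who want to see what standardizing a continuous variable means, here is a minimal z-score sketch in Python. It assumes the usual definition (subtract the mean, divide by the standard deviation) with the sample standard deviation; whether SPSS uses exactly this variant internally is an assumption. The function name `standardize` and the data are hypothetical.

```python
import statistics

def standardize(values):
    """Return z-scores: (value - mean) / sample standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 in the denominator)
    return [(v - mean) / sd for v in values]

# Hypothetical variable measured on an arbitrary scale.
years_on_job = [2, 4, 4, 4, 5, 5, 7, 9]
z = standardize(years_on_job)
print([round(value, 2) for value in z])
```

After standardizing, every continuous variable has mean 0 and standard deviation 1, so no variable dominates the distance calculation simply because of its measurement scale.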

  • @Gaskination
    @Gaskination 11 years ago

    Did you double click it? You have to double click it to make it show up.

  • @azianwacko
    @azianwacko 8 years ago

    Hello again James, can you explain how the analysis actually creates the clusters? I've tried using it for categorical variables and I'm not fully understanding just how it determines the clusters. Thank you

  • @Gaskination

    @Gaskination

    8 years ago

    Here are some resources to help you understand 2 step cluster analysis better: 1. www.ibm.com/support/knowledgecenter/SSLVMB_21.0.0/com.ibm.spss.statistics.help/idh_twostep_main.htm 2. www.spss.ch/upload/1122644952_The%20SPSS%20TwoStep%20Cluster%20Component.pdf 3. www.ryerson.ca/~rmichon/mkt700/SPSS/TwoStep%20Cluster%20Analysis.htm 4. kzread.info/dash/bejne/ZICulMSOXdaod6Q.html

  • @TheCopginger
    @TheCopginger 11 years ago

    Thanks Mr. Gaskination! would you also show much more complicated (both in terms of data and procedure) segmentation.

  • @souksomphoneanothay1149
    @souksomphoneanothay1149 11 years ago

    good video

  • @joseedupont2409
    @joseedupont2409 10 years ago

    Very helpful! What version of SPSS are you using?

  • @Gaskination

    @Gaskination

    10 years ago

    Probably v20 or 21 in this video. Maybe 19...

  • @OPaixao13
    @OPaixao13 8 years ago

    Hi James How can I get the Cubic Criterion Values at different number of clusters under consideration?? I think it's also a good way to justify why X number of clusters instead of Y, right??

  • @Gaskination

    @Gaskination

    8 years ago

    I'm not sure. I've never heard of the cubic criterion. Best of luck to you.

  • @serendipita5823
    @serendipita5823 8 years ago

    thank u 😃

  • @AdrienneDequina
    @AdrienneDequina 8 years ago

    thanks a lot! i will use this in my phd die-ssertation lol

  • @koenovisch
    @koenovisch 11 years ago

    James, do you know a video in which the IPA (importance/performance analysis) is being explained? Have you made such a video?

  • @cynthiagallagher75
    @cynthiagallagher75 7 years ago

    Is there a video that provides more detail on interpreting the clusters themselves? It would be helpful to understand how the clusters are being selected and how the clusters are developed.

  • @Gaskination

    @Gaskination

    7 years ago

    The only other two-step cluster analysis video that I have is part of the Rosen College SEM Boot Camp: kzread.info/dash/bejne/ZICulMSOXdaod6Q.html

  • @Zopzuita
    @Zopzuita 11 years ago

    Great video! I only have a problem with the model viewer - it doesn't show up. The results are written in the column in my table but the output misses the interactive graphics. Does anybody else have the same problem? Any ideas how to fix this? Thanks!!!

  • @TheCopginger
    @TheCopginger 11 years ago

    By the way, I was performing cluster analysis based on your video. However, I have few questions to ask you 1. Is it possible to assign weightage to individual record while performing segmentation? 2. If there is already weightage available for individual record (based on other criterion) how to make use of that in the segmentation process?

  • @azianwacko
    @azianwacko 8 years ago

    Hello James, can you explain evaluation fields and whether something like a scale of mental health would go in there?

  • @Gaskination

    @Gaskination

    8 years ago

    +Thomas Chan Evaluation fields are used to see differences in evaluation variables based on cluster membership. It is sort of like doing an ANOVA on those variables, using the cluster membership as the factoring variable. The evaluation variables will not be used to determine cluster membership.
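
The idea behind evaluation fields (comparing a variable across clusters without letting it influence membership) can be illustrated with a small Python sketch that averages an evaluation variable by cluster. The function name `mean_by_cluster`, the cluster ids, and the scores are hypothetical; in practice the cluster ids would come from the membership variable SPSS saves.

```python
from collections import defaultdict

def mean_by_cluster(cluster_ids, values):
    """Average an evaluation variable within each cluster."""
    groups = defaultdict(list)
    for cid, value in zip(cluster_ids, values):
        groups[cid].append(value)
    return {cid: sum(vals) / len(vals) for cid, vals in groups.items()}

# Hypothetical data: saved cluster membership and a mental health scale score.
cluster_ids = [1, 1, 2, 2, 2]
mental_health = [30, 40, 50, 60, 70]
print(mean_by_cluster(cluster_ids, mental_health))
```

A one-way ANOVA with cluster membership as the factor would then test whether these cluster means differ significantly.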

  • @bayankhalifa1543
    @bayankhalifa1543 10 years ago

    Very helpful, Prof. I did clustering for 2 continuous variables and 4 clusters, but how can I represent them in a high-high, high-low, low-high, and low-low matrix? Also, the two variables are highly correlated, will it be bad for clustering? Thanks.

  • @Gaskination

    @Gaskination

    10 years ago

    1. How to represent them: If you click on one of the button options in the table of clusters, you will see their distributions along the scale of measurement (low/high). The button looks like a distribution. This should help you represent them. 2. If they are highly correlated, then it might just be difficult to find low-high and high-low since they are probably mostly low-low and high-high.

  • @bayankhalifa1543

    @bayankhalifa1543

    10 years ago

    James Gaskin thanks, but which cluster is the high-high quarter in the matrix, which one is the high-low quarter, etc.? thanks

  • @Gaskination

    @Gaskination

    10 years ago

    bayan khalifa Just look at the distributions it shows you. If the bulk of the distribution is on the right, then it is high (assuming your scale went from low to high), if it is on the left, then it is low.

  • @harsin009
    @harsin009 6 years ago

    Can these profiles really be used as a moderator in SEM analysis? Because I thought SEM only uses continuous variables since it analyzes relationship between multiple variables through regression analysis. For a while, I thought you were referring to Hierarchical Regression Analysis. Thank you!

  • @Gaskination

    @Gaskination

    6 years ago

    It can be used as a multigroup moderator for multigroup analysis, which is a form of moderation.

  • @TheAce0
    @TheAce0 6 years ago

    You mention that when having SPSS determine clusters automatically, Euclidean distance measurement is more appropriate but when specifying the number of clusters, Log-likelihood is preferred. Could you perhaps elaborate on why this is the case? Would you know any papers that go into a bit of detail about this?

  • @Gaskination

    @Gaskination

    6 years ago

    oooh, this has been a while. The literature I read at the time suggested these things, but I can't remember which articles and books I read, or what they had to say about it. Sorry about that. If cluster analysis was something I did more often, I would have a better answer for you. But I haven't done a cluster analysis again since making this video...

  • @TheAce0

    @TheAce0

    6 years ago

    Ah, okay, fair enough. I'm dealing with cluster analysis right now and need to figure out which parameters are appropriate and why :)

  • @spss-for-research6518
    @spss-for-research6518 9 years ago

    I have a dumb problem and I wonder if someone could help me. The SPSS shows the cluster comparisons only for the inputs, but NOT for the descriptive variables. It just shows a message: "the cluster comparison view encountered a problem and cannot display correctly" or something like that. Why? I can't figure out.

  • @Gaskination

    @Gaskination

    9 years ago

    spss-for-research I'm not sure. It may have something to do with the variables included. Try removing one variable at a time to see if you can identify which one is causing the problem. If it isn't that, then it may be a conflict in one of the libraries being utilized to run the analysis. If that is the case, then you might need to reinstall SPSS, or you might need to update your java or .NET version (not sure which one SPSS uses).

  • @sureshpatel3992
    @sureshpatel3992 3 years ago

    Hello James, can Two-step Cluster Analysis handle mixed variable type? Eg. some variables that are output of factor analysis (that will have negative values too), and some binary variables?

  • @Gaskination

    @Gaskination

    3 years ago

    Yes. The two-step method can handle all types of variables. The only thing you need to watch out for is highly skewed or kurtotic variables, or discrete (categorical/nominal) variables without adequate representation from each group/category.

  • @sureshpatel3992

    @sureshpatel3992

    3 years ago

    @@Gaskination thanks so much for your reply, this would really help!

  • @MrMustav
    @MrMustav 11 years ago

    Dear do you have a tutorial of logistic regression? Would be great!

  • @shahzadfarid6446
    @shahzadfarid6446 6 years ago

    Sir, Please upload detail lectures on Optimal scaling in SPSS (i.e. MCA, CATPCA and non-linear canonical correlation). These lectures are not available on KZread. I searched in your channel , with the hope ... , but unfortunately ....

  • @Gaskination

    @Gaskination

    6 years ago

    I have never done those, so I cannot make videos on them. Any time I learn a new analysis, I make a video for it. If I ever have occasion to do these, I'll make videos for them. Best of luck to you.

  • @mldsg72
    @mldsg72 9 years ago

    James, nice job, very well done! Do you mind to make a little comment about AIC and BIC on 2-step cluster?

  • @Gaskination

    @Gaskination

    9 years ago

    Marcelo Gabriel I was not aware you could generate AIC and BIC in SPSS during a 2-step cluster analysis. I've gone back to it to fiddle with it, but I can't figure it out if it is possible.

  • @mldsg72

    @mldsg72

    9 years ago

    James, thanks for your reply. At least on versions 20 and 22, you must check the "Clustering Criterion" by choosing BIC or AIC. I'm more inclined to consider AIC than BIC due to its characteristics. Your comment would be nice. Regards

  • @Gaskination

    @Gaskination

    9 years ago

    Marcelo Gabriel Thanks for pointing me to that. I played with it and looked into it and it appears that the results are often the same (with my data), but that in general, AIC is preferred to BIC. Here is an informative explanation of why as well as some useful references: en.wikipedia.org/wiki/Akaike_information_criterion#Comparison_with_BIC
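
The difference between the two criteria discussed in this thread comes down to the penalty term. The sketch below uses the standard definitions, AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L, where k is the number of estimated parameters, n the number of cases, and ln L the log-likelihood. The candidate solutions and their log-likelihood values are hypothetical, and the exact parameter count SPSS uses for a two-step solution is not reproduced here.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical candidates: (number of clusters, log-likelihood, parameters).
candidates = [(2, -520.0, 8), (3, -500.0, 12), (4, -497.0, 16)]
n_obs = 200

best_by_aic = min(candidates, key=lambda c: aic(c[1], c[2]))
best_by_bic = min(candidates, key=lambda c: bic(c[1], c[2], n_obs))
print("AIC picks", best_by_aic[0], "clusters; BIC picks", best_by_bic[0])
```

Because ln(n) exceeds 2 once n is larger than about 7, BIC penalises extra parameters more heavily than AIC and tends to favour solutions with fewer clusters.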

  • @TulioMaia
    @TulioMaia 12 years ago

    About the database you've used. Where did you get ir? Is it in the program itself?

  • @jeromeboissel2793
    @jeromeboissel2793 3 years ago

    Dear James, what references did you use on this occasion? Besides, which would be most appropriate: K-means or two-step? In the paper I am working on, I have used both sets of analysis and, while the number of clusters remains the same, the number of respondents in each cluster differs quite significantly depending on which technique I use. Any tips?

  • @Gaskination

    @Gaskination

    3 years ago

    I'm not much of an expert on cluster analysis. I've just used the Hair et al 2010 book. As for which approach to use, I think two-step is considered the most useful and valid, since it combines hierarchical and non-hierarchical methods.

  • @JessicaRodrigues-wz3xo
    @JessicaRodrigues-wz3xo 7 years ago

    Hi! How can I choose variables that are significant to use in it? Is there a statistical test to help? I have a lot of variables and I want to know how I should choose them, and whether there is a criterion.

  • @Gaskination

    @Gaskination

    7 years ago

    Usually it is chosen theoretically, rather than statistically.

  • @JessicaRodrigues-wz3xo

    @JessicaRodrigues-wz3xo

    7 years ago

    Thank you for responding! I have several variables to draw a social and demographic profile of my population. Theoretically all these variables are important, but when I do the analysis with all of them, the results are not good. In other versions of SPSS there was a cut in those variables, a critical value, but I do not know how to identify this in SPSS 22. Can you help me, please?

  • @Gaskination

    @Gaskination

    7 years ago

    Jéssica Rodrigues you can look at the cluster quality or at the variable importance graph. These will give you indications of the overall value of the variables for clustering into groups.

  • @arieprabowo4675
    @arieprabowo4675 7 years ago

    do u have installer for spss 13? two step cluster only can be operate in spss version 13 i guess. thx before

  • @Gaskination

    @Gaskination

    7 years ago

    13? That is very old. SPSS is now on version 24. My version 24 runs the two step just fine. I don't have an installer though, as I'm not a licensed distributor.

  • @MrMustav
    @MrMustav 11 years ago

    What if one of the item after applying post hoc shows a non significant p value e.g. you differentiate clusters on a variable, and then find that two of the clusters do not significantly differ on one item.

  • @Sari2024m
    @Sari2024m 5 years ago

    I think you are treating some variables as continuous that are actually categorical.

  • @olfabenarfa3790
    @olfabenarfa3790 10 years ago

    Very informative video and extremely helpful as usual. I have only one concern is that when I did it the first time it gave me 3 groups, I ran it again it gave me 2 groups,…I did it many times and I noticed that the results are not stable! How come that the same steps and same algorithm gave different results! Did anyone face this issue with the two steps cluster analysis? Thanks.

  • @Gaskination

    @Gaskination

    10 years ago

    That is bizarre... I'm not sure what would be causing that. It should be the same every time I think.

  • @tayeenulhoque1637
    @tayeenulhoque1637 9 years ago

    Can you please explain or suggest which cluster analysis should be applied to Likert scale ordinal data? Is it K-means, hierarchical, or two-step? Is it necessary to conduct CATPCA (categorical principal component analysis) prior to starting the cluster analysis, and can you please tell me how to proceed with the cluster analysis after CATPCA? I have four exogenous variables which contain 20 items.

  • @Gaskination

    @Gaskination

    9 years ago

    Usually we would use factor analysis for this kind of data. However, if you want to do a cluster, then I would do the EFA first and generate factor scores for each construct. Then use these factor scores in a cluster analysis. 2-step or k-means each offer slightly different features and analyses, so you could try both.

  • @tayeenulhoque1637

    @tayeenulhoque1637

    9 years ago

    Thank you Mr. James.. i really appreciate your valuable comments

  • @Zopzuita
    @Zopzuita 11 years ago

    I can't double-click since the model viewer doesn't show up at all. It writes the clusters in the column but that's it, even though I activated the option... Any ideas what could be wrong? Thanks a lot in advance!

  • @DisconnectHack
    @DisconnectHack 8 years ago

    Hi James, did you say "swarming variable" or "swapping variable"? I couldn't figure it out, and I have tried looking for definitions for both, only found "swapping variable" for computer science, were you talking about the same ?

  • @Gaskination

    @Gaskination

    8 years ago

    +DisconnectHack Swamping. I don't know what the technical term would be (or if there is one).

  • @DisconnectHack

    @DisconnectHack

    8 years ago

    +James Gaskin Thanks James, it appears there isn't one.

  • @nihonbunka
    @nihonbunka 7 years ago

    Is it possible to analyse cluster NOT around central concepts like intelligence or years on the job but upon family relationship (binary relationship closeness in a network with the absence of commonalities, as is the case in real families).

  • @Gaskination

    @Gaskination

    7 years ago

    That's an interesting idea, but I don't know how to do it in a two-step. You might be able to do it with multiple alignment algorithms, but I'm not sure if SPSS has those...

  • @nihonbunka

    @nihonbunka

    7 years ago

    Thank you very much indeed. I have found a partial solution in the software here socnetv.org/downloads which has a network analysis network community detection algorithm which can be used on the correlation matrix produced by SPSS factor analysis. Others have had the idea before journals.plos.org/plosone/article?id=10.1371/journal.pone.0051558 using a different community detection algorithm Full statement of problem and partial solution www.talkstats.com/showthread.php/69145-Family-Relationship-version-of-Factor-analysis-for-Japanese-Groups?p=199672&highlight=#post199672

  • @Gaskination

    @Gaskination

    7 years ago

    cool! Thanks!

  • @brandonknettel545
    @brandonknettel545 10 years ago

    I meant "k-means". Thanks.

  • @Gaskination
    @Gaskination 11 years ago

    I don't yet, but people keep asking for one, so I should probably do one.

  • @Gaskination
    @Gaskination 11 years ago

    I have not. Best of luck. But, basically it is like an R-squared analysis. It shows how much of the variance is being explained by each indicator.