Efficient Personality Assessment with the Big Five Dataset and MindsDB

I am excited to share a project I created using MindsDB and Node.js that automates the analysis of the Big Five Personality Test dataset from Kaggle. The project was inspired by my interest in psychology and my desire to explore the power of machine learning in personality assessment. I have written this article so that you can implement the project alongside it #CIY (Code It Yourself).

For your reference, you can see the project at github.com/deepam-kapur/Big-Five-Personalit.. (I will keep refining the project, and you can also open a pull request if you want to change something.)

Personality is a complex construct that has always fascinated researchers and psychologists. It is a trait that helps to define an individual's unique characteristics, behaviours, and attitudes. The Big Five Personality Traits, also known as the Five-Factor Model, is one of the most widely used personality assessment tools in psychology. It consists of five personality dimensions: openness, conscientiousness, extraversion, agreeableness, and neuroticism. These dimensions are believed to be the fundamental traits that define an individual's personality.

#MindsDB (cloud.mindsdb.com) is an open-source tool that enables machine learning automation. It simplifies the process of building predictive models and allows users to create machine learning models without writing any code.

To begin the project, the Big Five Personality Test dataset was first loaded into MindsDB. The dataset contains 50 questions, and each question has five possible responses, ranging from "Strongly Agree" to "Strongly Disagree." The dataset was then preprocessed to convert the responses to numerical values ranging from 1 to 5.

Reference to Kaggle dataset - kaggle.com/datasets/tunguz/big-five-persona..
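
The preprocessing step described above (turning text responses into numbers from 1 to 5) can be sketched in a few lines of Node.js. This is only an illustration: the article names just the two endpoint labels, so the three middle labels, the direction of the scale ("Strongly Agree" as 5), and the helper names likertToScore and encodeResponses are assumptions.

```javascript
// Assumed label set: only "Strongly Agree" and "Strongly Disagree" are named
// in the article; the middle labels and the scale direction are illustrative.
const LIKERT_SCORES = new Map([
    ['Strongly Disagree', 1],
    ['Disagree', 2],
    ['Neutral', 3],
    ['Agree', 4],
    ['Strongly Agree', 5]
]);

// Convert one text response to its numeric score; throw on an unknown label.
function likertToScore(label) {
    const score = LIKERT_SCORES.get(label.trim());
    if (score === undefined) {
        throw new Error(`Unknown response label: ${label}`);
    }
    return score;
}

// Convert a whole answer sheet, e.g. { EXT1: 'Agree', EXT2: 'Neutral' }.
function encodeResponses(answers) {
    return Object.fromEntries(
        Object.entries(answers).map(([code, label]) => [code, likertToScore(label)])
    );
}
```

Calling encodeResponses on an object keyed by question code returns the same keys with numeric values, ready to be stored or sent to a model.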

MindsDB uses a machine learning algorithm to predict the final personality type of the user based on their responses to the questions. The algorithm is trained on a subset of the dataset and uses this training to predict the personality type of new users. The predicted personality type is returned to the server, which sends it back to the client as an HTTP response.

I first processed the dataset to derive a personality_type label for each row. The query below scores each of the five traits from its ten answers and labels the trait as High or Low.

-- An average of exactly 2.5 falls into the Low branch, so CONCAT never receives NULL
SELECT *, CONCAT(
        CASE WHEN EXT > 2.5 THEN 'High Extraversion, '
             ELSE 'Low Extraversion, '
        END,
        CASE WHEN AGR > 2.5 THEN 'High Agreeableness, '
             ELSE 'Low Agreeableness, '
        END,
        CASE WHEN CSN > 2.5 THEN 'High Conscientiousness, '
             ELSE 'Low Conscientiousness, '
        END,
        CASE WHEN EST > 2.5 THEN 'High Emotional Stability, '
             ELSE 'Low Emotional Stability, '
        END,
        CASE WHEN OPN > 2.5 THEN 'High Openness'
             ELSE 'Low Openness'
        END
    ) AS personality_type FROM
    -- Per-row trait score: the mean of that trait's ten answers.
    -- (AVG() is an aggregate across rows, so the mean is computed by hand.)
    (SELECT *,
    (EXT1+EXT2+EXT3+EXT4+EXT5+EXT6+EXT7+EXT8+EXT9+EXT10) / 10.0 AS EXT,
    (AGR1+AGR2+AGR3+AGR4+AGR5+AGR6+AGR7+AGR8+AGR9+AGR10) / 10.0 AS AGR,
    (CSN1+CSN2+CSN3+CSN4+CSN5+CSN6+CSN7+CSN8+CSN9+CSN10) / 10.0 AS CSN,
    (EST1+EST2+EST3+EST4+EST5+EST6+EST7+EST8+EST9+EST10) / 10.0 AS EST,
    (OPN1+OPN2+OPN3+OPN4+OPN5+OPN6+OPN7+OPN8+OPN9+OPN10) / 10.0 AS OPN
    FROM bfpt) AS bfpt_updated;
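
For readers who prefer to see the labelling logic outside SQL, here is a rough Node.js equivalent of the query's intent: average the ten answers per trait, then compare against the 2.5 cut-off. The function names traitAverage and personalityType are hypothetical, and treating an average of exactly 2.5 as Low is an assumption made for this sketch.

```javascript
// Trait prefixes in the same order the query concatenates them.
const TRAITS = {
    EXT: 'Extraversion',
    AGR: 'Agreeableness',
    CSN: 'Conscientiousness',
    EST: 'Emotional Stability',
    OPN: 'Openness'
};
const THRESHOLD = 2.5; // same cut-off as the query

// Mean of the ten answers for one trait prefix, e.g. EXT1..EXT10.
function traitAverage(row, prefix) {
    let sum = 0;
    for (let i = 1; i <= 10; i++) {
        sum += Number(row[`${prefix}${i}`]);
    }
    return sum / 10;
}

// Build the comma-separated label, one "High ..." or "Low ..." part per trait.
function personalityType(row) {
    return Object.entries(TRAITS)
        .map(([prefix, name]) => {
            // Assumption: an average of exactly 2.5 counts as Low.
            const level = traitAverage(row, prefix) > THRESHOLD ? 'High' : 'Low';
            return `${level} ${name}`;
        })
        .join(', ');
}
```

Passing one row of the dataset (an object with the 50 answer columns) to personalityType returns a label in the same format as the personality_type column.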

After that, I exported the result to a CSV file, uploaded it to the MindsDB files module, and created a model from it.

CREATE MODEL mindsdb.bfpt_predict
FROM files
(select * from bfpt)
PREDICT personality_type;

With the model in place, we can create an API that accepts the answers to the questions and returns the predicted personality type.

const express = require('express');
const bodyParser = require('body-parser');
// With require(), the SDK's main object is the default export
const MindsDB = require('mindsdb-js-sdk').default;

const app = express();

// Column codes for the 50 questions: five traits, ten questions each
const TRAITS = ['EXT', 'EST', 'AGR', 'CSN', 'OPN'];
const QUESTION_CODES = TRAITS.flatMap(trait =>
    Array.from({ length: 10 }, (_, i) => `${trait}${i + 1}`)
);

(async () => {
    // Connect to the MindsDB cloud
    await MindsDB.connect({
        user: process.env.MINDS_DB_USER,
        password: process.env.MINDS_DB_PASSWORD
    });

    // Load the model once, in a scope the route handler can see.
    // Models.getModel and model.query follow the mindsdb-js-sdk README;
    // check them against the SDK version you have installed.
    const model = await MindsDB.Models.getModel('bfpt_predict', 'mindsdb');
    console.log('Model loaded successfully!');

    // Parse JSON request bodies
    app.use(bodyParser.json());

    // Define the API route
    app.post('/predict', async (req, res) => {
        const data = req.body;
        try {
            // Build one "CODE = value" condition per question;
            // Number() keeps non-numeric input out of the query
            const where = QUESTION_CODES.map(code => `${code} = ${Number(data[code])}`);

            // Use the MindsDB model to predict the personality type
            const prediction = await model.query({ where });

            // Return the predicted personality type as a JSON response
            res.json({ personality_type: prediction.value });
        } catch (err) {
            console.error('Error predicting personality type:', err);
            res.status(500).send('Internal server error');
        }
    });

    // Start the server
    const port = 3000;
    app.listen(port, () => {
        console.log(`Server running on port ${port}`);
    });
})().catch(err => {
    console.error('Error starting server:', err);
    process.exit(1);
});

Through this API, we can pass the answers to the 50 questions in the dataset and get the predicted personality type back. Each answer is a value from 1 to 5, where 1 is the lowest and 5 is the highest, matching the numeric scale used in preprocessing.
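
To make the request shape concrete, here is a minimal client-side sketch that checks a payload and posts it to the /predict route. Several things are assumptions: the 1 to 5 range follows the preprocessing step described earlier (adjust MIN_SCORE and MAX_SCORE if your data differs), the server is assumed to run locally on port 3000, the built-in fetch requires Node 18 or later, and the helper names validateAnswers and requestPrediction are hypothetical.

```javascript
// The 50 question codes: five trait prefixes, ten questions each.
const TRAIT_CODES = ['EXT', 'EST', 'AGR', 'CSN', 'OPN'];
const QUESTION_CODES = TRAIT_CODES.flatMap(trait =>
    Array.from({ length: 10 }, (_, i) => `${trait}${i + 1}`)
);

// Assumed answer range; change these if your data uses a different scale.
const MIN_SCORE = 1;
const MAX_SCORE = 5;

// Return a list of problems with the payload; an empty list means it is valid.
function validateAnswers(answers) {
    const problems = [];
    for (const code of QUESTION_CODES) {
        const value = answers[code];
        if (!Number.isInteger(value) || value < MIN_SCORE || value > MAX_SCORE) {
            problems.push(`${code} must be an integer from ${MIN_SCORE} to ${MAX_SCORE}`);
        }
    }
    return problems;
}

// POST the answers to the /predict route and return the parsed JSON body.
async function requestPrediction(answers, baseUrl = 'http://localhost:3000') {
    const problems = validateAnswers(answers);
    if (problems.length > 0) {
        throw new Error(`Invalid answers: ${problems.join('; ')}`);
    }
    const response = await fetch(`${baseUrl}/predict`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(answers)
    });
    if (!response.ok) {
        throw new Error(`Prediction request failed with status ${response.status}`);
    }
    return response.json();
}
```

Validating before sending keeps obviously bad payloads (missing questions, out-of-range values) from ever reaching the model.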

The 50 questions of the personality test, with their respective codes, are:

  • EXT1 - I am the life of the party.

  • EXT2 - I don't talk a lot.

  • EXT3 - I feel comfortable around people.

  • EXT4 - I keep in the background.

  • EXT5 - I start conversations.

  • EXT6 - I have little to say.

  • EXT7 - I talk to a lot of different people at parties.

  • EXT8 - I don't like to draw attention to myself.

  • EXT9 - I don't mind being the centre of attention.

  • EXT10 - I am quiet around strangers.

  • EST1 - I get stressed out easily.

  • EST2 - I am relaxed most of the time.

  • EST3 - I worry about things.

  • EST4 - I seldom feel blue.

  • EST5 - I am easily disturbed.

  • EST6 - I get upset easily.

  • EST7 - I change my mood a lot.

  • EST8 - I have frequent mood swings.

  • EST9 - I get irritated easily.

  • EST10 - I often feel blue.

  • AGR1 - I feel little concern for others.

  • AGR2 - I am interested in people.

  • AGR3 - I insult people.

  • AGR4 - I sympathize with others' feelings.

  • AGR5 - I am not interested in other people's problems.

  • AGR6 - I have a soft heart.

  • AGR7 - I am not interested in others.

  • AGR8 - I take time out for others.

  • AGR9 - I feel others' emotions.

  • AGR10 - I make people feel at ease.

  • CSN1 - I am always prepared.

  • CSN2 - I leave my belongings around.

  • CSN3 - I pay attention to details.

  • CSN4 - I make a mess of things.

  • CSN5 - I get chores done right away.

  • CSN6 - I often forget to put things back in their proper place.

  • CSN7 - I like order.

  • CSN8 - I shirk my duties.

  • CSN9 - I follow a schedule.

  • CSN10 - I am exacting in my work.

  • OPN1 - I have a rich vocabulary.

  • OPN2 - I have difficulty understanding abstract ideas.

  • OPN3 - I have a vivid imagination.

  • OPN4 - I am not interested in abstract ideas.

  • OPN5 - I have excellent ideas.

  • OPN6 - I do not have a good imagination.

  • OPN7 - I am quick to understand things.

  • OPN8 - I use difficult words.

  • OPN9 - I spend time reflecting on things.

  • OPN10 - I am full of ideas.

Go ahead and try it yourself #CIY (Code It Yourself).

#MindsDB #MindsDBHackathon

Cheers to Hashnode(hashnode.com)!
