Learnings from assignments / open book exams at Indian Institute of Technology Gandhinagar – Executive Masters in Data Science for Decision Making

One important lesson I learned while working with spatio-temporal graph data on the METR-LA dataset during my Executive Masters open-book assignment:

Do not keep switching between Claude, ChatGPT, Perplexity, Gemini, and other LLMs or AI tools during the execution stage. The same lesson has come up repeatedly over the two years of the Executive Masters, whenever we were allowed to use LLMs.

My learning:

• Different LLMs reason differently

• They are trained and fine-tuned differently

• They suggest different libraries, assumptions, fixes, and coding styles

• Mixing their guidance during debugging can create unnecessary chaos

• What looks like “more intelligence” can become “more confusion”

• Multi-model thinking is useful during brainstorming

• It helps in debating, exploring, comparing, and expanding ideas

• But once execution begins, consistency matters more than variety

• Pick one model and work through the problem step by step

• Ask it to explain, debug, simplify, correct, and iterate

• Stay with one reasoning path until the solution stabilizes

My conclusion:

Use multiple LLMs for exploration.

Use one LLM for execution.

Mixing models during ideation can create insight.

Mixing models during implementation can create chaos.

This is especially true in technical work involving data science, graph ML, spatio-temporal modeling, package dependencies, tensor shapes, runtime environments, and debugging.

Progress comes from disciplined iteration, not tool-hopping.

Note: Enhanced / compiled with help of AI / LLMs

Dimensions for Artificial Intelligence / GenAI / LLMs / Deep Learning / Neural Networks / Data Science to ponder on – Part 1 – Assisted by AI – ChatGPT


🧠 1. Model Performance & Quality

Beyond accuracy:

  • Precision / Recall / F1-score
  • ROC-AUC
  • Calibration (how closely predicted probabilities match observed frequencies)
  • Generalization ability
  • Robustness (noise, adversarial inputs)
  • Stability (variance across runs)
  • Overfitting / Underfitting control
  • Latency (response time)
  • Throughput (requests per second)
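
To make "beyond accuracy" concrete, here is a minimal sketch in plain Python that computes precision, recall, and F1 from two label lists. The function name and the toy labels are my own illustration, not taken from any particular library. On this imbalanced example, accuracy looks strong while recall shows that half the positives were missed.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy, made-up labels: 8 negatives, 2 positives, one positive missed.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```

In real projects a tested library implementation is usually the better choice; the sketch only shows what the numbers mean.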

⚖️ 2. Responsible AI / Ethics

Along with fairness, bias, explainability, interpretability:

  • Accountability
  • Transparency
  • Non-discrimination
  • Inclusiveness
  • Human oversight / Human-in-the-loop
  • Ethical alignment
  • Value alignment (especially for LLMs)
  • Safety (harm prevention)

🔐 3. Security & Privacy

Critical for enterprise and GenAI:

  • Data privacy (PII protection)
  • Differential privacy
  • Federated learning capability
  • Model security (model theft, extraction)
  • Prompt injection resistance (LLMs)
  • Data leakage prevention
  • Adversarial robustness
  • Access control & authentication

📊 4. Data Quality & Governance

Often more important than the model itself:

  • Data completeness
  • Data consistency
  • Data lineage
  • Data drift detection
  • Concept drift detection
  • Bias in training data
  • Data freshness
  • Label quality
  • Auditability
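
One simple way to put a number on data drift is the Population Stability Index (PSI), which compares how a feature is distributed in a reference sample and in a newer sample. The sketch below is a minimal pure-Python version. The equal-width binning, the `eps` floor for empty bins, and the sample values are assumptions I chose for illustration, and the threshold for "too much drift" remains a project-level decision.

```python
import math


def population_stability_index(expected, actual, bins=10, eps=1e-6):
    """PSI between a reference sample and a new sample of one numeric feature."""
    lo, hi = min(expected), max(expected)
    # Equal-width bins over the reference range; fall back to 1.0 if range is zero.
    width = (hi - lo) / bins or 1.0

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the reference range go into the first or last bin.
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        # Floor each share at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    e = proportions(expected)
    a = proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Made-up data: the "new" sample is the reference shifted upwards by 0.3.
reference = [i / 100 for i in range(100)]
shifted = [0.3 + i / 100 for i in range(100)]

print("PSI, same data:   ", population_stability_index(reference, reference))
print("PSI, shifted data:", population_stability_index(reference, shifted))
```

A PSI of zero means the binned distributions match; larger values mean the new data has moved further from the reference.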

⚙️ 5. Model Lifecycle & MLOps

Operational excellence:

  • Reproducibility
  • Versioning (data + model)
  • Monitoring (real-time + batch)
  • Model retraining strategy
  • Deployment reliability
  • Rollback capability
  • CI/CD for ML pipelines
  • Observability (logs, metrics, traces)
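
As a small illustration of versioning data and model settings together, the sketch below hashes a training configuration and the data rows into one short fingerprint using only the standard library. The function name, the config keys, and the rows are hypothetical; dedicated data and model versioning tools do much more than this.

```python
import hashlib
import json


def fingerprint(data_rows, config):
    """Return a short, deterministic ID for a (data, config) pair."""
    h = hashlib.sha256()
    # sort_keys makes the hash independent of dictionary key order.
    h.update(json.dumps(config, sort_keys=True).encode("utf-8"))
    for row in data_rows:
        h.update(json.dumps(row, sort_keys=True).encode("utf-8"))
    return h.hexdigest()[:12]


# Hypothetical training rows and hyperparameters.
rows = [{"sensor": 1, "speed": 64.2}, {"sensor": 2, "speed": 58.9}]
config = {"learning_rate": 0.001, "epochs": 20, "seed": 42}

run_id = fingerprint(rows, config)
print("run fingerprint:", run_id)
```

Logging such a fingerprint next to each result makes it easier to tell whether two runs really used the same data and settings, which supports reproducibility and rollback.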

🧩 6. LLM / GenAI Specific Parameters

Especially important for GenAI work:

  • Hallucination rate
  • Faithfulness (groundedness to source)
  • Context retention (long context handling)
  • Instruction following
  • Toxicity / harmful output control
  • Prompt sensitivity
  • Response consistency
  • Token efficiency (cost optimization)
  • Alignment with system prompts / policies
  • Retrieval quality (RAG precision/recall)
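
Retrieval quality for a RAG pipeline can be checked per query with precision@k and recall@k. The sketch below assumes you already have a ranked list of retrieved document IDs without duplicates and a hand-labelled set of relevant IDs; the function name and the IDs are made up for illustration.

```python
def precision_recall_at_k(retrieved, relevant, k):
    """Precision@k and recall@k for one query.

    retrieved: ranked list of document IDs (assumed to have no duplicates)
    relevant:  set of document IDs judged relevant for the query
    """
    top_k = retrieved[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical retriever output and relevance labels for a single query.
retrieved = ["d3", "d7", "d1", "d9", "d4"]
relevant = {"d1", "d3", "d5"}

p_at_3, r_at_3 = precision_recall_at_k(retrieved, relevant, k=3)
print(f"precision@3={p_at_3:.2f} recall@3={r_at_3:.2f}")
```

Averaging these values over a set of test queries gives a simple baseline to track before and after changing the retriever or the chunking strategy.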

🧪 7. Evaluation & Testing

For enterprise-grade systems:

  • Benchmarking (standard datasets)
  • Stress testing
  • Edge case coverage
  • Scenario testing
  • A/B testing
  • Human evaluation (subjective scoring)
  • Red teaming (especially for GenAI)
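
For A/B testing, one common check is a two-proportion z-test on a success rate, for example the share of answers rated helpful under model A versus model B. The sketch below uses only the standard library; the counts are invented, and it assumes independent samples that are large enough for the normal approximation to hold.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference between two success rates.

    Assumes the pooled rate is strictly between 0 and 1, so that the
    standard error is non-zero.
    """
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Invented counts: variant A 120/1000 helpful answers, variant B 150/1000.
z, p_value = two_proportion_z_test(120, 1000, 150, 1000)
print(f"z={z:.3f} p_value={p_value:.4f}")
```

A small p-value suggests the difference is unlikely to be chance alone, but the significance level and the minimum effect worth acting on should be fixed before the experiment starts.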

🌐 8. Business & Product Metrics

Often ignored in technical discussions:

  • ROI / Cost-benefit
  • User satisfaction
  • Adoption rate
  • Time saved / productivity gain
  • Decision impact quality
  • Revenue impact
  • Risk reduction

🧭 9. Governance & Compliance

Especially relevant in India (e.g., the Digital Personal Data Protection Act, 2023):

  • Regulatory compliance
  • Audit trails
  • Model documentation (Model Cards)
  • Explainability for regulators
  • Consent management
  • Data residency

🧠 Quick Memory Framework

You can compress everything into:

👉 FAPES-DLMGB

  • Fairness & Ethics
  • Accuracy & Performance
  • Privacy & Security
  • Explainability
  • Scalability & Stability
  • Data Quality
  • Lifecycle (MLOps)
  • Monitoring
  • Governance
  • Business Impact

Reference frameworks:

  • NIST AI Risk Management Framework
  • ISO/IEC 42001

Note: Enhanced / compiled with help of AI / LLMs