sentence-transformers
Python framework for sentence and text embeddings using transformers.
Works with
0
total installs
0
this week
24.2K
GitHub stars
0
upvotes
Install Skill
Run in your terminal
0
installs
0
this week
24.2K
stars
Installation Guide
How to use sentence-transformers on Cursor
AI-first code editor with Composer
Prerequisites
Before installing skills in Cursor, ensure your development environment meets these requirements:
- ›Cursor installed and configured on your machine
- ›Node.js 16+ with npm — verify with
node --version - ›Active project directory where you want to add
sentence-transformers
Run the install command
Execute the skills CLI command in your project's root directory to begin installation:
Fetches sentence-transformers from davila7/claude-code-templates and configures it for Cursor.
Select Cursor when prompted
The CLI shows a list of agents. Use arrow keys and space to select Cursor:
Verify installation
Confirm successful installation by checking the skill directory location:
Restart Cursor to activate sentence-transformers. Access via /sentence-transformers in your agent's command palette.
Security Notice
We perform automated surface-level scans (Gen AI Scanner, Socket, Snyk) during installation. These checks detect common vulnerabilities but do not guarantee complete security. Always review skill source code and verify the publisher's reputation before production use.
Skills execute code in your environment. Always review source, verify the publisher, and test in isolation before production.
Documentation
Sentence Transformers - State-of-the-Art Embeddings
Python framework for sentence and text embeddings using transformers.
When to use Sentence Transformers
Use when:
- Need high-quality embeddings for RAG
- Semantic similarity and search
- Text clustering and classification
- Multilingual embeddings (100+ languages)
- Running embeddings locally (no API)
- Cost-effective alternative to OpenAI embeddings
Metrics:
- 15,700+ GitHub stars
- 5000+ pre-trained models
- 100+ languages supported
- Based on PyTorch/Transformers
Use alternatives instead:
- OpenAI Embeddings: Need API-based, highest quality
- Instructor: Task-specific instructions
- Cohere Embed: Managed service
Quick start
Installation
pip install sentence-transformers
Basic usage
from sentence_transformers import SentenceTransformer
# Load model
model = SentenceTransformer('all-MiniLM-L6-v2')
# Generate embeddings
sentences = [
"This is an example sentence",
"Each sentence is converted to a vector"
]
embeddings = model.encode(sentences)
print(embeddings.shape) # (2, 384)
# Cosine similarity
from sentence_transformers.util import cos_sim
similarity = cos_sim(embeddings[0], embeddings[1])
print(f"Similarity: {similarity.item():.4f}")
Popular models
General purpose
# Fast, good quality (384 dim)
model = SentenceTransformer('all-MiniLM-L6-v2')
# Better quality (768 dim)
model = SentenceTransformer('all-mpnet-base-v2')
# Best quality (1024 dim, slower)
model = SentenceTransformer('all-roberta-large-v1')
Multilingual
# 50+ languages
model = SentenceTransformer('paraphrase-multilingual-MiniLM-L12-v2')
# 100+ languages
model = SentenceTransformer('paraphrase-multilingual-mpnet-base-v2')
Domain-specific
# Legal domain
model = SentenceTransformer('nlpaueb/legal-bert-base-uncased')
# Scientific papers
model = SentenceTransformer('allenai/specter')
# Code
model = SentenceTransformer('microsoft/codebert-base')
Semantic search
from sentence_transformers import SentenceTransformer, util
model = SentenceTransformer('all-MiniLM-L6-v2')
# Corpus
corpus = [
"Python is a programming language",
"Machine learning uses algorithms",
"Neural networks are powerful"
]
# Encode corpus
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
# Query
query = "What is Python?"
query_embedding = model.encode(query, convert_to_tensor=True)
# Find most similar
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=3)
print(hits)
Similarity computation
# Cosine similarity
similarity = util.cos_sim(embedding1, embedding2)
# Dot product
similarity = util.dot_score(embedding1, embedding2)
# Pairwise cosine similarity
similarities = util.cos_sim(embeddings, embeddings)
Batch encoding
# Efficient batch processing
sentences = ["sentence 1", "sentence 2", ...] * 1000
embeddings = model.encode(
sentences,
batch_size=32,
show_progress_bar=True,
convert_to_tensor=False # or True for PyTorch tensors
)
Fine-tuning
from sentence_transformers import InputExample, losses
from torch.utils.data import DataLoader
# Training data
train_examples = [
InputExample(texts=['sentence 1', 'sentence 2'], label=0.8),
InputExample(texts=['sentence 3', 'sentence 4'], label=0.3),
]
train_dataloader = DataLoader(train_examples, batch_size=16)
# Loss function
train_loss = losses.CosineSimilarityLoss(model)
# Train
model.fit(
train_objectives=[(train_dataloader, train_loss)],
epochs=10,
warmup_steps=100
)
# Save
model.save('my-finetuned-model')
LangChain integration
from langchain_community.embeddings import HuggingFaceEmbeddings
embeddings = HuggingFaceEmbeddings(
model_name="sentence-transformers/all-mpnet-base-v2"
)
# Use with vector stores
from langchain_chroma import Chroma
vectorstore = Chroma.from_documents(
documents=docs,
embedding=embeddings
)
LlamaIndex integration
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
embed_model = HuggingFaceEmbedding(
model_name="sentence-transformers/all-mpnet-base-v2"
)
from llama_index.core import Settings
Settings.embed_model = embed_model
# Use in index
index = VectorStoreIndex.from_documents(documents)
Model selection guide
| Model | Dimensions | Speed | Quality | Use Case |
|---|---|---|---|---|
| all-MiniLM-L6-v2 | 384 | Fast | Good | General, prototyping |
| all-mpnet-base-v2 | 768 | Medium | Better | Production RAG |
| all-roberta-large-v1 | 1024 | Slow | Best | High accuracy needed |
| paraphrase-multilingual | 768 | Medium | Good | Multilingual |
Best practices
- Start with all-MiniLM-L6-v2 - Good baseline
- Normalize embeddings - Better for cosine similarity
- Use GPU if available - 10× faster encoding
- Batch encoding - More efficient
- Cache embeddings - Expensive to recompute
- Fine-tune for domain - Improves quality
- Test different models - Quality varies by task
- Monitor memory - Large models need more RAM
Performance
| Model | Speed (sentences/sec) | Memory | Dimension |
|---|---|---|---|
| MiniLM | ~2000 | 120MB | 384 |
| MPNet | ~600 | 420MB | 768 |
| RoBERTa | ~300 | 1.3GB | 1024 |
Resources
- GitHub: https://github.com/UKPLab/sentence-transformers ⭐ 15,700+
- Models: https://huggingface.co/sentence-transformers
- Docs: https://www.sbert.net
- License: Apache 2.0
List & Monetize Your Skill
Submit your Claude Code skill and start earning
Use Cases
User Story & Requirements Generation
Create detailed user stories, acceptance criteria, and feature specs
Example
Generate user stories for 'password reset feature' with acceptance criteria, edge cases, and test scenarios
Reduce spec writing time by 50%, ensure comprehensive coverage
Competitive Analysis
Research competitors, compare features, identify gaps
Example
Analyze 5 competitor products, create feature comparison matrix, suggest differentiation opportunities
Complete competitive research in 2 hours instead of 2 days
Roadmap Prioritization
Evaluate features using frameworks (RICE, ICE, Kano) and create prioritized backlogs
Example
Score 20 feature ideas using RICE framework, generate prioritized roadmap with rationale
Make data-driven prioritization decisions faster
Stakeholder Communication
Draft PRDs, status updates, and stakeholder presentations
Example
Create executive summary of Q3 roadmap, monthly progress report, feature launch announcement
Save 3-5 hours/week on communication overhead
Implementation Guide
Prerequisites
- ›Claude Desktop or compatible AI client
- ›Access to product documentation and roadmap tools (Jira, Notion, etc.)
- ›Understanding of product management frameworks (RICE, Jobs-to-be-Done, etc.)
- ›Stakeholder contact information and communication channels
Time Estimate
30-60 minutes to see productivity improvements
Steps
- 1Install product management skill
- 2Start with user story generation for known feature
- 3Progress to competitive analysis: research 2-3 competitors
- 4Use for roadmap prioritization: apply RICE/ICE scoring
- 5Draft stakeholder communications and refine based on feedback
- 6Build template library for recurring PM tasks
- 7Share effective prompts with product team
Common Pitfalls
- ⚠Not validating competitive research—verify facts before sharing
- ⚠Accepting user stories without involving engineering team
- ⚠Over-relying on frameworks without qualitative judgment
- ⚠Not customizing outputs to company culture and communication style
- ⚠Skipping stakeholder validation of generated requirements
Best Practices
✓ Do
- +Validate research and competitive analysis with real data
- +Collaborate with engineering when generating technical requirements
- +Customize frameworks and templates to your company context
- +Use skill for first drafts, refine with stakeholder input
- +Document successful prompt patterns for PM tasks
- +Combine AI efficiency with human judgment and intuition
✗ Don't
- −Don't publish competitive analysis without fact-checking
- −Don't finalize user stories without engineering review
- −Don't make prioritization decisions solely on AI scoring
- −Don't skip customer validation of generated requirements
- −Don't ignore company-specific context and culture
💡 Pro Tips
- ★Provide context: company goals, constraints, customer feedback
- ★Ask for alternatives: 'Show 3 ways to prioritize this roadmap'
- ★Request stakeholder-specific formatting: 'Executive summary vs. engineering spec'
- ★Use skill for 70% generation + 30% customization to company needs
When to Use This
✓ Use when
Use for user story writing, competitive research, roadmap prioritization, stakeholder communication, and PRD drafting. Best for reducing repetitive documentation and research work.
✗ Avoid when
Avoid for strategic product vision (requires deep customer empathy), pricing decisions (needs market and financial expertise), or when face-to-face customer discovery is more valuable than speed.
Learning Path
- 1Basic: user stories, feature specs, status updates
- 2Intermediate: competitive analysis, prioritization frameworks, PRDs
- 3Advanced: product strategy, go-to-market planning, OKR setting
- 4Expert: product vision, market positioning, business model innovation
Related Skills
ml-paper-writing
66davila7/claude-code-templates
grill-me
365mattpocock/skills
premortem
196parcadei/continuous-claude-v3
deslop
116cursor/plugins
framer-motion
97pproenca/dot-skills
write-a-prd
88mattpocock/skills
Reviews
- SShikha Mishra★★★★★Dec 16, 2024
sentence-transformers is among the better-maintained entries we tried; worth keeping pinned for repeat workflows.
- AAditi Sanchez★★★★★Dec 8, 2024
We added sentence-transformers from the explainx registry; install was straightforward and the SKILL.md answered most questions upfront.
- LLi Nasser★★★★★Nov 27, 2024
Keeps context tight: sentence-transformers is the kind of skill you can hand to a new teammate without a long onboarding doc.
- LLi Park★★★★★Nov 7, 2024
sentence-transformers is among the better-maintained entries we tried; worth keeping pinned for repeat workflows.
- AAmelia Martin★★★★★Oct 26, 2024
Useful defaults in sentence-transformers — fewer surprises than typical one-off scripts, and it plays nicely with `npx skills` flows.
- CChen Ndlovu★★★★★Oct 18, 2024
sentence-transformers has been reliable in day-to-day use. Documentation quality is above average for community skills.
- AArya Abbas★★★★★Sep 13, 2024
We added sentence-transformers from the explainx registry; install was straightforward and the SKILL.md answered most questions upfront.
- RRahul Santra★★★★★Sep 9, 2024
Keeps context tight: sentence-transformers is the kind of skill you can hand to a new teammate without a long onboarding doc.
- PPratham Ware★★★★★Aug 28, 2024
sentence-transformers has been reliable in day-to-day use. Documentation quality is above average for community skills.
- CChinedu Thompson★★★★★Aug 4, 2024
Solid pick for teams standardizing on skills: sentence-transformers is focused, and the summary matches what you get after install.
showing 1-10 of 29
Discussion
Comments — not star reviews- No comments yet — start the thread.