What Are Information Signals?
Overview
Information signals are markers that identify sensitive, regulated, or confidential data within user communications. They underpin data loss prevention, compliance monitoring, and privacy protection. Detecting them reliably helps prevent unauthorized disclosure, supports regulatory compliance, and maintains user trust through appropriate data handling.
Named Entities
Definition: Identifiable real-world objects, people, places, organizations, dates, and other specific entities that may carry contextual significance or sensitivity.
Characteristics:
- Proper nouns and specific identifiers
- Contextually significant references
- Entities with potential privacy implications
- Geographic, temporal, and organizational markers
Entity Categories:
- Person Names: Full names, nicknames, aliases, public figures
- Organizations: Companies, institutions, government agencies, non-profits
- Locations: Addresses, cities, countries, landmarks, GPS coordinates
- Dates and Times: Specific dates, time periods, scheduling information
- Products and Services: Brand names, software, proprietary systems
- Events: Conferences, meetings, incidents, historical events
Example Patterns:
- "John Smith from Acme Corporation"
- "Meeting scheduled for January 15th at Google headquarters"
- "The incident occurred at 123 Main Street, New York"
- "Contact Sarah Johnson regarding the Microsoft partnership"
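As a rough illustration of how such patterns might be surfaced, the sketch below treats any run of two or more capitalized words as a candidate entity. The function name `candidate_entities` and the regular expression are assumptions made for this example; a production system would more likely use a trained named-entity recognizer.

```python
import re

# Two or more consecutive capitalized words, e.g. "John Smith".
# Illustrative heuristic only; this is not a real NER model.
CAPITALIZED_RUN = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b")


def candidate_entities(text: str) -> list[str]:
    """Return runs of capitalized words as candidate named entities."""
    return CAPITALIZED_RUN.findall(text)


print(candidate_entities("John Smith from Acme Corporation"))
# prints ['John Smith', 'Acme Corporation']
```

The heuristic has obvious limits: it misses single-word entities such as "Google" and over-matches sentence-initial words, so "Contact Sarah Johnson" would be returned as one candidate.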
PII/PHI/PCI
(Personally Identifiable Information/Protected Health Information/Payment Card Industry Data)
Definition: Regulated categories of sensitive personal information that require special handling, protection, and compliance measures under various legal frameworks.
PII (Personally Identifiable Information):
- Social Security Numbers (SSN)
- Driver's license numbers
- Passport numbers
- National identification numbers
- Biometric identifiers
- Email addresses and phone numbers
- Full names combined with other identifiers
PHI (Protected Health Information):
- Medical record numbers
- Health plan beneficiary numbers
- Medical device identifiers
- Diagnostic codes and medical conditions
- Treatment information and prescriptions
- Healthcare provider information
- Insurance information related to health
PCI (Payment Card Industry Data):
- Credit card numbers (full or partial)
- CVV/CVC security codes
- Expiration dates combined with card data
- Cardholder names
- Banking account numbers
- Routing numbers and SWIFT codes
Example Patterns:
- PII: "SSN: 123-45-6789", "Driver's License: DL123456789"
- PHI: "Patient ID: MRN-789456", "Diagnosis: ICD-10 E11.9"
- PCI: "Card ending in 1234", "Account number: ****-****-****-5678"
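Patterns like these are often matched with regular expressions plus a checksum to reduce false positives. The following is a minimal sketch under stated assumptions: it covers only US-style SSNs in `NNN-NN-NNNN` form and 13 to 19 digit card numbers, and the names `SSN_RE`, `CARD_RE`, `luhn_valid`, and `find_regulated_data` are hypothetical. The Luhn checksum is the standard check digit scheme used by payment card numbers.

```python
import re

# Assumed formats for illustration: hyphenated SSNs and 13-19 digit
# card numbers with optional space or hyphen separators.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(candidate: str) -> bool:
    """Return True if the digits in `candidate` pass the Luhn checksum."""
    digits = [int(ch) for ch in candidate if ch.isdigit()]
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return bool(digits) and total % 10 == 0


def find_regulated_data(text: str) -> list[tuple[str, str]]:
    """Return (kind, matched_text) pairs for SSN-like and card-like strings."""
    findings = [("ssn", m.group()) for m in SSN_RE.finditer(text)]
    findings += [
        ("card", m.group())
        for m in CARD_RE.finditer(text)
        if luhn_valid(m.group())  # drop digit runs that fail the checksum
    ]
    return findings
```

For example, `find_regulated_data("SSN: 123-45-6789")` returns `[("ssn", "123-45-6789")]`. PHI identifiers such as medical record numbers have no universal format, so they would need organization-specific patterns.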
Secrets
Definition: Confidential information that provides access to systems, services, or sensitive data, including authentication credentials, cryptographic keys, and proprietary information.
Authentication Secrets
- API Keys and Access Tokens: Service-specific authentication credentials including AWS access keys, Azure API keys, Google API keys, OpenAI API keys, Stripe API keys, Twilio access tokens, and other platform-specific access credentials
- OAuth Tokens: Authorization tokens including OAuth access tokens, refresh tokens, and authorization codes across platforms like GitHub, Google, Dropbox, and social media services
- Service Account Credentials: Specialized authentication for automated services including AWS IAM credentials, Azure service principals, and IBM Cloud service IDs
- Session and Temporary Tokens: Time-limited authentication including AWS STS tokens, Azure shared access signatures, and Vault service tokens
Cryptographic Material
- Private Keys and Certificates: SSH private keys, DSA private keys, and SSL/TLS certificates used for secure communications and digital signatures
- Encryption Keys: Symmetric and asymmetric encryption keys including those embedded in connection strings and service configurations
- JSON Web Tokens (JWT): Signed tokens containing claims and authentication information
- JSON Web Encryption (JWE): Encrypted JSON-based tokens for secure data transmission
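To show why a signed JWT is sensitive even before its signature is considered, the sketch below decodes the header and payload of a compact three-segment token without verifying it. The helper names `b64url_decode` and `peek_jwt` are assumptions for this example, and the code is meant for inspection only, not for authentication decisions.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring any stripped '=' padding."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def peek_jwt(token: str) -> dict:
    """Return the decoded header and payload of a compact JWT.

    The signature is NOT verified; this only shows what a reader of the
    token could see.
    """
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated segments")
    header = json.loads(b64url_decode(parts[0]))
    payload = json.loads(b64url_decode(parts[1]))
    return {"header": header, "payload": payload}
```

Because the header is a JSON object, its base64url encoding typically begins with `eyJ`, which is why that prefix is a common detection hint for JWTs.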
System Connection Credentials
- Database Connection Strings: Complete connection credentials for databases including Azure SQL, Azure Cosmos DB, Azure Redis Cache, and MongoDB connections
- Service Connection Strings: Authentication strings for cloud services like Azure Service Bus, Azure IoT Hub, and other messaging platforms
- Webhook URLs: Secure endpoints for automated notifications including Discord webhooks and Slack webhooks
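URI-style connection strings carry the password in the user-info part of the URI, so one way to handle them is to redact that part before logging or storing the string. The sketch below uses Python's standard `urllib.parse`; the function name `redact_uri_password` and the `***` placeholder are assumptions for this example. Key-value strings in the `Server=...;Password=...` style would need a different parser.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_uri_password(uri: str) -> str:
    """Replace the password in a URI-style connection string with '***'."""
    parts = urlsplit(uri)
    if parts.password is None:
        return uri  # nothing to redact
    # netloc looks like "user:password@host:port"; keep user and host:port.
    userinfo, _, hostport = parts.netloc.rpartition("@")
    user = userinfo.split(":", 1)[0]
    redacted_netloc = f"{user}:***@{hostport}"
    return urlunsplit(parts._replace(netloc=redacted_netloc))


print(redact_uri_password("mongodb://user:pass@host:27017/db"))
# prints mongodb://user:***@host:27017/db
```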
Platform-Specific Secrets
- Cloud Provider Credentials: Authentication materials for major cloud platforms (AWS, Azure, IBM Cloud, Heroku)
- Development and Collaboration Tools: Access credentials for platforms like GitHub, Azure DevOps, and Slack
- Communication and Marketing Services: API keys for services like Twilio, SendGrid, Mailchimp, and Mailgun
- Payment Processing: Secure credentials for financial services like PayPal, Stripe, and Square
- Social Media Integration: Access tokens for platforms like Facebook, Instagram, and Twitter/X
Example Patterns
- AWS: AKIA... (Access Key ID), aws_secret_access_key=...
- API Keys: sk_live_... (Stripe), xoxb-... (Slack), AIza... (Google)
- Connection Strings: mongodb://user:pass@host:port/db, Server=...;Password=...
- OAuth Tokens: Bearer eyJ..., refresh_token=1//...
- SSH Keys: -----BEGIN PRIVATE KEY-----, ssh-rsa AAAA...
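Prefix patterns like these lend themselves to a simple regular-expression scanner. The sketch below is illustrative only: the prefixes come from the examples above, while the lengths and character classes after each prefix, along with the names `SECRET_PATTERNS` and `scan_for_secrets`, are assumptions rather than vendor-confirmed formats.

```python
import re

# Prefixes follow the example patterns; the lengths and character
# classes after each prefix are assumptions for illustration.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "stripe_live_key": re.compile(r"\bsk_live_[0-9A-Za-z]{16,}\b"),
    "slack_bot_token": re.compile(r"\bxoxb-[0-9A-Za-z-]{10,}"),
    "google_api_key": re.compile(r"\bAIza[0-9A-Za-z_-]{35}"),
    "pem_private_key": re.compile(r"-----BEGIN (?:[A-Z]+ )*PRIVATE KEY-----"),
}


def scan_for_secrets(text: str) -> list[str]:
    """Return the labels of all secret patterns found in `text`.

    Only labels are returned, so the scanner itself never echoes
    the matched secret.
    """
    return [
        label
        for label, pattern in SECRET_PATTERNS.items()
        if pattern.search(text)
    ]
```

For instance, scanning a string that contains `-----BEGIN RSA PRIVATE KEY-----` returns `["pem_private_key"]`. Prefix matching alone can produce false positives and misses secrets without a recognizable prefix, so real scanners commonly combine it with other checks.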